CN118170979A - Learning path recommendation method based on multi-concept learning resources

Info

Publication number: CN118170979A (legal status: Pending)
Application number: CN202410318122.9A
Authority: CN (China); original language: Chinese (zh)
Inventors: 赵斌, 薛景涛, 吉根林
Assignee: Nanjing Normal University
Classification: Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract

The invention discloses a learning path recommendation method based on multi-concept learning resources, comprising the following steps: inputting the historical learning record into an Encoder module for processing to obtain a learning record matrix; inputting the learning record matrix and the adjacency matrix into a T-GCN model in a Decoder module to obtain the knowledge level; inputting the knowledge level into the softmax layer of the Decoder module to output the probability distribution over concepts that need to be learned; inputting the probability distribution into the candidate concept generation layer of the candidate set generation module to generate a concept candidate set; inputting the concept candidate set into the candidate resource generation layer of the candidate set generation module to generate a learning resource candidate set; acquiring a learning path from the learning resource candidate set; and selecting the best learning path with the knowledge tracking module. The method recommends to learners learning paths containing multi-concept learning resources that match their individual requirements, so that learners obtain better learning effects, and it can be applied to a wider range of real scenarios.

Description

Learning path recommendation method based on multi-concept learning resources
Technical Field
The invention belongs to the field of intelligent education, relates to learning path recommendation technology, and particularly relates to a learning path recommendation method based on multi-concept learning resources.
Background
With the rapid development of massive open online courses (MOOCs), online education platforms have made autonomous learning feasible. MOOC platforms such as Coursera and edX provide learners with a large number of high-quality educational resources that help them achieve their educational objectives. However, the very richness of these resources can leave learners facing cognitive overload and disorientation. Learning path recommendation has therefore attracted considerable attention in academia and has become a focus of intelligent education research.
Learning path recommendation aims to find suitable learning resources for the learner and arrange these resources into an ordered sequence. Previous studies can be categorized into macroscopic and microscopic views according to the objects being ordered. Early research mainly adopted traditional recommendation algorithms to match resources to learners' interests at the macroscopic resource level, constructing a learning sequence to meet learners' demands. With the introduction of knowledge tracking technology (Knowledge Tracing), recommendation models can evaluate learners' mastery at the microscopic concept level and thus recommend more accurate and effective learning paths that improve the learning effect. Recent studies, such as those of Liu and Chen, perform concept ordering through knowledge tracking feedback to assist the recommendation model, optimize the concept sequence with a reinforcement learning framework, and markedly improve the learning effect. However, the objects ordered under the microscopic view are limited to learning resources containing a single concept, which restricts the application of learning path recommendation in broader learning scenarios.
The single-concept recommendation model has limitations when applied to real learning scenarios containing multi-concept learning resources. For example, in the widely used public EdNet dataset, learning resources containing multiple concepts account for nearly 50%. To fit the single-concept learning path recommendation method, multi-concept resources are decomposed into several single-concept resources, simplifying a complex learning scenario into a single-concept one. This simplification has two main drawbacks. First, the model loses the possibility of recommending integrated multi-concept resources. Suppose a learner wishes to improve performance on mathematical concepts C1 (integer) and C2 (real number); the model may recommend a learning path including single-concept resources R1 and R2, but cannot recommend the multi-concept resource R3 that integrates both concepts, even though this is typically the learner's actual need. Second, the model fails to exploit the inter-concept relationships unique to multi-concept resources. For example, decomposing the multi-concept resource R3 discards the co-occurrence relationship between C1 and C2, and the loss of such information may reduce the effectiveness of the recommended learning path.
Existing learning path recommendation methods can therefore provide the learner with a sequence of single-concept learning resources as a learning path. However, real learning scenarios contain a large number of learning resources corresponding to multiple concepts; single-concept methods deprive these multi-concept resources of any chance of being recommended, and make it hard for learners to develop the ability to integrate several concepts through such resources, limiting learning path recommendation as a whole. Solving the learning path recommendation problem in the multi-concept scenario is therefore important.
Disclosure of Invention
The invention aims to: solve the problem of learning path recommendation in the multi-concept scenario in the prior art by providing a learning path recommendation method based on multi-concept learning resources, which recommends to learners learning paths containing multi-concept learning resources that match their individual requirements.
The technical scheme is as follows: in order to achieve the above object, the present invention provides a learning path recommending method based on a multi-concept learning resource, comprising the steps of:
S1: inputting the historical learning record into the Encoder module for processing to obtain a learning record matrix;
S2: inputting the learning record matrix and the adjacency matrix into the T-GCN model in the Decoder module to obtain the knowledge level;
S3: inputting the knowledge level into the softmax layer in the Decoder module to output the probability distribution over concepts that need to be learned;
S4: inputting the probability distribution from step S3 into the candidate concept generation layer in the candidate set generation module to generate a concept candidate set;
inputting the concept candidate set into the candidate resource generation layer in the candidate set generation module to generate a learning resource candidate set;
acquiring a learning path from the learning resource candidate set;
S5: the best learning path is selected using a pre-trained knowledge tracking module.
Further, in step S1 the learning record matrix is obtained as follows: the historical learning record is first preprocessed by an embedding layer to obtain feature matrices at different moments, and a self-attention mechanism then performs a dimension-reduction operation on these feature matrices to obtain the learning record matrix.
Further, the specific process of preprocessing the historical learning record through the embedding layer in step S1 is as follows:
the historical learning record is arranged into a format that can be combined with the feature matrix, namely an N x K history Record matrix Record, where N is the number of knowledge concepts and K is the total number of the learner's history records. Each row represents the learner's answer to one question: the knowledge concepts corresponding to the question are marked 1 or 0 and the remaining positions are 0. With Fe_0 denoting the feature matrix at time t_0, the feature matrix at time t_i is computed as:
Fe_i = p(Fe_{i-1}, Record_{i-1}), 0 < i < K
where Record_{i-1} is the (i-1)-th row of the Record matrix, and the function p adds this row into the corresponding answer row of the Fe_{i-1} matrix; that is, the first M rows of the Fe matrix carry the question information and the last M rows carry the answers. The feature matrices at different moments are thus obtained:
FEs = {Fe_0, …, Fe_{K-1}}.
Further, in step S1 the specific process of applying the self-attention mechanism to reduce the dimension of the feature matrices at different moments and obtain the learning record matrix is as follows:
each matrix in FEs is processed in turn: Fe_i is reduced to an N x h matrix, where h, the dimension of the hidden layer, can be adjusted according to the actual training process. The K reduced matrices are finally stacked to construct the learning record matrix X of size N x (K·h), and the adjacency matrix Adj preprocessed in the preceding work is obtained.
Further, the T-GCN model in the step S2 comprises a GCN model and a GRU model, wherein the GCN model is used for modeling the knowledge level of a learner at a certain moment, the GRU model is used for analyzing the change rule of the knowledge level of the learner when the learner learns different learning resources, the T-GCN model tightly combines the GCN with the GRU, and the knowledge level of the learner is updated continuously according to the learning record of the learner.
Further, the operation of the GCN model in step S2 is as follows:
the adjacency matrix Adj (N x N) and the learning record matrix X are input into the GCN model; the model acts on the nodes of the graph, capturing the spatial features between nodes through their first-order neighborhood, and by stacking several convolution layers the nodes acquire global features, expressed as:
Ho^{l+1} = σ( D̂^{-1/2} Â D̂^{-1/2} Ho^l θ^l )
where Â = Adj + I_N is the adjacency matrix with self-connections (N x N), I_N is the identity matrix, and D̂ is the degree matrix (N x N) with D̂_{ii} = Σ_j Â_{ij}; the introduction of the degree matrix allows the learner's knowledge level to be updated with these weights without additional supervision information. Ho^l is the output of layer l, θ^l contains the parameters of layer l, and σ(·) is the sigmoid function of the nonlinear model.
A 2-layer GCN model is chosen to mine the dependencies between concepts, which can be expressed as:
f(X, Adj) = ReLU( Â' ReLU( Â' X W_0 ) W_1 )
where Â' = D̂^{-1/2} Â D̂^{-1/2} denotes the preprocessing step, W_0 ∈ R^{(K·h)×Hid1} is the weight matrix from the input to the hidden layer (K is the length of the feature matrix and Hid1 the number of hidden units), W_1 ∈ R^{Hid1×Hid2} is the weight matrix from the hidden layer to the output layer, f(X, Adj) ∈ R^{N×Hid2} is the learner's knowledge level representation, and ReLU(·) is the activation layer of the deep neural network.
The GRU model operates as follows: the GRU takes the hidden state at time t-1 as the current knowledge level input and obtains the knowledge level at time t.
Further, the calculation process of the T-GCN model in step S2 is as follows:
u_t = σ( W_u [ GC(Adj, X_t), h_{t-1} ] + b_u )
r_t = σ( W_r [ GC(Adj, X_t), h_{t-1} ] + b_r )
c_t = tanh( W_c [ GC(Adj, X_t), r_t * h_{t-1} ] + b_c )
h_t = u_t * h_{t-1} + (1 - u_t) * c_t
where h_{t-1} denotes the output at time t-1, GC(·) denotes the graph convolution process, u_t and r_t denote the update and reset gates at time t, c_t is the candidate state, and h_t denotes the output at time t.
Further, in step S3 the hidden state h_t output by the GRU is input as the knowledge level S into the softmax layer of the Decoder module to output the probability distribution over concepts that need to be learned, specifically:
the initial knowledge level S_0 is expressed as:
S_0 = h_0
where h_0 denotes the initial hidden state.
The knowledge level, the learning resources and the target concepts are introduced jointly to compute a score for each knowledge concept at the current step. Suppose that after the (i-1)-th learning resource r_{i-1} has been learned, r_{<i} denotes the learning resource sequence r_{<i} = {r_1, r_2, …, r_{i-1}} with corresponding knowledge concept set C_{index}; each concept c appears only once in C_{index}, Core(r_{index}) = c_1 denotes the center concept, and c_2, c_3 denote concepts adjacent to the center concept. The learned set of center concepts is defined as c_{<i} = {Core(r_1), Core(r_2), …, Core(r_{i-1})} = {c_1, c_2, c_3, …}, and the concept probability distribution of step i is output through the softmax layer:
score_j^{(i)} = w^T tanh( W_1 S_{i-1} + W_2 v_j + W_3 e_T + b )
where j = 1, 2, …, N (the total number of concepts), v_j is the feature representation of the j-th concept obtained by the encoder, e_T is the fused embedding of the concepts in the target concept set T, and w, W_1, W_2, W_3 and b are learnable weights or matrices;
P(ĉ_i = c_j | r_{<i}, T) = exp(score_j^{(i)}) / Σ_{c_k ∈ C} exp(score_k^{(i)})
where ĉ_i denotes the concept to be selected at step i, c_j denotes the j-th concept, the exponentials compute the concept distribution (the softmax operation), and C is the whole concept set.
Sampling from this distribution yields the knowledge concept ĉ_i selected at the given step.
Further, step S4 is specifically:
random sampling at the candidate concept generation layer obtains a candidate concept ĉ_i; the concepts adjacent to it on the graph are included in the candidate concept set, these concepts are combined as the candidate set, and the probability of the learning resources containing each combination is calculated according to the probability formula to form the learning resource candidate set. The probability formula is:
P(r̂_i = r_j) = Σ_{c_k ∈ KP_j} degree(c_k) / Σ_{r_m} Σ_{c_k ∈ KP_m} degree(c_k)
where r̂_i denotes the learning resource to be selected at step i, r_j denotes the j-th learning resource, KP_j denotes the knowledge concept set corresponding to the j-th learning resource, and degree(c_k) denotes the node degree of concept c_k.
When the learning resources are scored, the probabilities of the different learning resources are computed from the node degrees, and resources that have already been selected are set to 0.
The adjacent concepts are then determined from the currently selected concept, the learning resource candidate sets corresponding to all concept combinations are obtained, and a learning resource r_i for position i is sampled from the candidate set according to the obtained learning resource distribution P(r̂_i).
Repeating the above process generates the final learning path Lp = {r_1, r_2, …, r_n} together with the probability P(ĉ_i) of the concept and the probability P(r̂_i) of the learning resource chosen at each step.
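The degree-weighted resource sampling described above can be sketched as follows; a minimal sketch in which `resource_concepts` maps each candidate resource to its concept set KP_j, and all names are illustrative rather than taken from the patent:

```python
def resource_distribution(resource_concepts, degree, chosen=()):
    """Score each candidate resource by the summed node degree of its
    concepts; already-selected resources are forced to probability 0."""
    scores = {}
    for r, concepts in resource_concepts.items():
        scores[r] = 0.0 if r in chosen else float(sum(degree[c] for c in concepts))
    total = sum(scores.values())
    # normalise the degree scores into a sampling distribution
    return {r: s / total for r, s in scores.items()}

dist = resource_distribution(
    {"r1": {"c1"}, "r2": {"c1", "c2"}, "r3": {"c2", "c3"}},
    degree={"c1": 2, "c2": 3, "c3": 1},
)
```

A learning resource covering higher-degree (more important) concepts thus receives a higher sampling probability, and passing the already-chosen resources via `chosen` implements the "set to 0" rule.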
Further, step S5 is specifically:
the probability of successfully mastering the i-th concept on the path is obtained with a sigmoid function:
y_i = Sigmoid( f_y(S_i) )
where f_y is an MLP layer and Sigmoid(·) is the sigmoid function.
The DKT model is one of the most classical knowledge tracking models and is therefore selected as one of the environments. The knowledge states at different moments are passed to the KT model; the environment computes the loss from the current knowledge state according to the formula below and passes it to the agent. In addition, after the learning path is generated, an answer record is obtained, and the learning effect E_T of each run is also passed to the agent as part of the loss. The trained agent, i.e. the Decoder model parameters containing the T-GCN, is finally obtained.
The loss function is designed as:
L = L_θ + β L_y
where L is the total loss, L_θ is the loss based on the learning resource recommended at each step, L_y is the loss based on the knowledge level at each moment, and β is the proportion L_y occupies.
The beneficial effects are that: compared with the prior art, the learning path recommendation method based on multi-concept learning resources provided by the invention solves the problem of learning path recommendation in the multi-concept scenario and can recommend to learners learning paths containing multi-concept learning resources that match their individual requirements.
Drawings
FIG. 1 is a frame diagram of a learning path recommendation method provided by the invention;
FIG. 2 is a schematic diagram of the structural composition of a T-GCN model;
FIG. 3 is a block diagram of a GRU model;
FIG. 4 is a diagram showing the operation of the T-GCN model.
Detailed Description
The present application is further illustrated by the accompanying drawings and the detailed description below, which are to be understood as merely illustrative of the application and not limiting of its scope; after reading the application, various equivalent modifications made by those skilled in the art will fall within the scope of the application as defined in the appended claims.
In order to solve the problem of learning path recommendation under a multi-concept scene in the prior art, the invention provides a learning path recommendation method based on multi-concept learning resources, which is used for recommending a learning path containing the multi-concept learning resources according with individual requirements for learners.
The problem definition of Multi-Concept Learning Path Recommendation is given below.
The learner's historical learning record set may be expressed as H = {U, R, SCORE}, where U is the learner set, R the learning resource set, and SCORE the set of historical learning scores. For a student u ∈ U, the historical learning record is expressed as H_u = {h_1, h_2, …, h_t}, where t is the total number of the student's history records and h_k = {r, score}, k ∈ [1, t], with learning resource r ∈ R and score ∈ SCORE. Each learning resource corresponds to a knowledge concept set, i.e. a set KP_r ⊆ C. Learner u completes the learning of n learning resources in a certain order, forming the learning path Lp = {r_1, r_2, …, r_n}. The learning effect is expressed as:
E_T = (E_e − E_b) / (E_sup − E_b)
where E_b and E_e denote the student's scores on the target concepts before and after learning, respectively, and E_sup denotes the upper bound of mastery.
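Reading the learning effect as a normalised gain (the exact formula is reconstructed here, since the original image was lost), it can be computed as:

```python
def learning_effect(e_b, e_e, e_sup):
    """Normalised gain: the improvement achieved relative to the maximum
    improvement still available before learning (assumes e_sup > e_b)."""
    return (e_e - e_b) / (e_sup - e_b)

# a learner who rises from 0.4 to 0.7 with an upper bound of 1.0
# realises half of the attainable improvement
effect = learning_effect(0.4, 0.7, 1.0)
```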
Definition 1: knowledge structure
The relations among the knowledge concepts of the learning resources are represented by a concept graph, defined as G = (C, RE), where C = {c_1, c_2, c_3, …} and each vertex corresponds to one knowledge concept. The relation set is RE = {re_1, re_2, …}, where each re represents some relation (e.g. a timing relation or a co-occurrence relation). Each vertex has a feature vector v_i representing the embedded encoding of the concept c_i associated with the learning resources, and the degree of a vertex represents the importance of the concept. This information is used in the computation of the subsequent graph convolutional network.
Definition 2: multi-concept knowledge tracking
Given a learner's historical learning record H_u = {h_1, h_2, …, h_t}, h_k = {r, score}, k ∈ [1, t], where each learning resource represents a question. Each r corresponds to multiple vertices of the knowledge structure G = (C, RE), and the answer score ∈ {0, 1} (0 indicates the learner answered incorrectly and 1 correctly). The goal is to model the knowledge level Y = {y_1, y_2, …, y_t} over the |C| concepts (i.e. all vertices) for the learner's first t moments, and to predict the probability that the learner correctly answers a new question r_{t+1}, i.e. P(score_{t+1} = 1 | r_{t+1}, H_u, G).
The following formally defines the learning path recommendation problem in the multi-concept scene:
Technical problem definition: multi-concept learning path recommendation
Given the historical learning record H = {U, R, SCORE}, the target concept set T = {c_1, c_2, c_3, …} and a learner u, the goal is to select n learning resources from R and generate a learning path Lp that maximizes the learning effect E_T.
Based on the above definition, the present invention provides a learning path recommendation framework, as shown in fig. 1, which includes four core parts: encoder module, decoder module, candidate set generation module and feedback module. First, encoder module is aimed at extracting co-occurrence and timing relationships between knowledge concepts and constructing a concept graph matrix to balance the expressions of these two types of relationships. The matrix is then fed into the self-attention layer to generate a learner's learning record matrix. The Decoder module then uses the learning record matrix to estimate the learner's initial knowledge level and simulate the learned knowledge level change process. Based on the current knowledge level, the module further recommends appropriate learning resources. In the candidate set generation module, a learning path is constructed by two random samplings. Finally, the feedback module provides training feedback for knowledge levels at different time points according to the effect of the learning path so as to optimize the selection of the learning path.
Based on a learning path recommendation framework, the invention provides a learning path recommendation method based on multi-concept learning resources, as shown in fig. 1, comprising the following steps:
S1: inputting the historical learning record into the Encoder module for processing to obtain the learning record matrix:
to address the lack of supervision information for multi-concept knowledge tracking, the node degree of a knowledge concept is used to represent its proportion within learning resources; constructing the concept graph structure from the historical learning record is therefore the first task.
The knowledge concept relations in the historical learning record are of two types: co-occurrence relations and timing relations. A co-occurrence relation holds between knowledge concepts that appear in the same learning resource, which can be considered closely correlated. For example, if the high-school mathematics question A involves concepts such as integers and real numbers, these concepts appear together in the question, and edges between every pair of the corresponding vertices should be created on the graph structure. A timing relation is a predecessor-successor relation between the knowledge concepts contained in adjacent learning resources. For example, if the high-school mathematics question B practised after question A involves the concept of rounding, then integer and real number should have a predecessor relation to rounding. All knowledge concepts are used as nodes and these two classes of relations as edges to construct an N x N adjacency matrix Adj. One-hot codes then record the learning resources corresponding to each knowledge concept, giving an initial feature matrix Fe of size N x 2M, where N is the number of knowledge concepts and M is the number of questions represented by the different codes: the first M entries indicate which questions a knowledge concept corresponds to, and the last M entries record the learner's answers to those questions, with 1 for a correct answer and 0 for a wrong one; all initial values are 0.
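The construction of the adjacency matrix from these two relation types can be sketched as follows; this is a minimal illustration, with the timing edges assumed directed from predecessor to successor, and all names illustrative:

```python
import numpy as np

def build_adjacency(record_concepts, n_concepts):
    """Build the N x N adjacency matrix Adj from a chronological list of
    per-resource concept sets: concepts sharing a resource get co-occurrence
    edges, and concepts of adjacent resources get timing (predecessor) edges."""
    adj = np.zeros((n_concepts, n_concepts), dtype=int)
    for step, concepts in enumerate(record_concepts):
        for a in concepts:              # co-occurrence: every pair in one resource
            for b in concepts:
                if a != b:
                    adj[a, b] = 1
        if step > 0:                    # timing: previous resource -> current one
            for a in record_concepts[step - 1]:
                for b in concepts:
                    if a != b:
                        adj[a, b] = 1
    return adj

# question A covers concepts {0, 1, 2}; question B, practised next, covers {3}
adj = build_adjacency([{0, 1, 2}, {3}], n_concepts=4)
```

The co-occurring concepts of question A become mutually connected, while the timing edge from those concepts to the rounding-style concept 3 records the predecessor relation.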
The learning record matrix in this step is obtained as follows: the historical learning record is first preprocessed by an embedding layer to obtain feature matrices at different moments, and a self-attention mechanism then performs a dimension-reduction operation on these feature matrices to obtain the learning record matrix.
The specific process of preprocessing the historical learning record through the embedding layer is as follows:
the historical learning record is arranged into a format that can be combined with the feature matrix, namely an N x K history Record matrix Record, where N is the number of knowledge concepts and K is the total number of the learner's history records. Each row represents the learner's answer to one question: the knowledge concepts corresponding to the question are marked 1 or 0 and the remaining positions are 0. With Fe_0 denoting the feature matrix at time t_0, the feature matrix at time t_i is computed as:
Fe_i = p(Fe_{i-1}, Record_{i-1}), 0 < i < K
where Record_{i-1} is the (i-1)-th row of the Record matrix, and the function p adds this row into the corresponding answer row of the Fe_{i-1} matrix; for example, if the (i-1)-th row of the Record matrix records the answer to question R3 and question R3 corresponds to row j of the Fe matrix, 0 < j < M, then the row is added to row M+j of the Fe matrix. That is, the first M rows of the Fe matrix carry the question information and the last M rows carry the answers. The feature matrices at different moments are thus obtained:
FEs = {Fe_0, …, Fe_{K-1}}
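The update function p can be sketched as follows; a minimal illustration in which Fe is oriented as N concepts by 2M question/answer slots (one plausible reading of the description), and all names are illustrative:

```python
def update_feature_matrix(fe_prev, record_row, question_index, m):
    """The function p: copy Fe_{i-1} and write the answer vector
    Record_{i-1} (length N, one 1/0 entry per concept) into the answer
    slot M + j for question R_j. The N x 2M orientation is an assumption."""
    fe = [row[:] for row in fe_prev]            # copy Fe_{i-1}, keep it immutable
    for concept, answered in enumerate(record_row):
        fe[concept][m + question_index] = answered
    return fe

M, N = 3, 4                                     # 3 questions, 4 concepts
fe0 = [[0] * (2 * M) for _ in range(N)]         # initial all-zero Fe_0
fe1 = update_feature_matrix(fe0, [1, 0, 0, 1], question_index=2, m=M)
```

Each call produces the next Fe_i without touching Fe_{i-1}, so the whole sequence FEs = {Fe_0, …, Fe_{K-1}} can be retained.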
the specific process of applying the self-attention mechanism to reduce the dimension of the feature matrices at different moments and obtain the learning record matrix is as follows:
each matrix in FEs is processed in turn: Fe_i is reduced to an N x h matrix, where h, the dimension of the hidden layer, can be adjusted according to the actual training process. The K reduced matrices (each N x h) are finally stacked to construct the learning record matrix X of size N x (K·h), and the adjacency matrix Adj preprocessed in the preceding work is obtained.
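A single dimension-reduction step can be sketched as a minimal single-head self-attention over the N concept rows of one Fe_i; the weight shapes and names are assumptions for illustration, not the patent's exact parameterisation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def reduce_features(fe, wq, wk, wv):
    """Single-head self-attention over the N rows of Fe_i (N x 2M),
    projecting each row down to h dimensions via the value matrix Wv.
    Wq, Wk, Wv are all 2M x h."""
    q, k, v = fe @ wq, fe @ wk, fe @ wv              # each N x h
    attn = softmax(q @ k.T / np.sqrt(k.shape[1]))    # N x N, rows sum to 1
    return attn @ v                                  # reduced N x h features

rng = np.random.default_rng(0)
n, two_m, h = 4, 6, 3
fe_i = rng.random((n, two_m))
out = reduce_features(fe_i, *(rng.random((two_m, h)) for _ in range(3)))
```

Applying this to all K matrices in FEs and stacking the N x h outputs yields the learning record matrix described above.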
S2: record matrix to be learnedAnd inputting the adjacency matrix Adj into a T-GCN model in a Decoder module, and acquiring a knowledge level:
On the basis of the learning record matrix and the adjacent matrix, the Decoder is responsible for calculating the knowledge level, simulating the change process of the knowledge level when a learner contacts different learning resources, and providing the knowledge concept probability distribution required by random sampling for the subsequent candidate set generation module;
The core of the Decoder module is a knowledge tracking model, whose main function is to simulate the learner's knowledge level and predict answer outcomes from it. How to use node degree information to represent the influence weight of knowledge concepts on learning resources is an important problem for the knowledge tracking task. Most current knowledge tracking models focus on extracting the timing information of knowledge concepts and adopt RNN, LSTM or GRU networks as their main components, achieving good predictive performance. However, these networks cannot efficiently extract the influence weight of knowledge concepts on question answers. The invention observes that knowledge concepts live in the graph structure of the knowledge graph, so modeling the learner's knowledge level with the graph structure is natural; a deep learning technique capable of processing graph-structured data is therefore selected as a complement.
A conventional convolutional neural network (CNN) can extract local features in Euclidean space, but it cannot effectively process data with a complex topology such as the knowledge concept graph. In recent years researchers have generalized CNNs to graph convolutional networks (GCNs), which can handle graph-structured data of any type and have received a great deal of attention. As described in related work, the GCN model has been successful in a variety of application scenarios, such as document classification, unsupervised learning and image classification. The GCN updates node features with node degree information, effectively addressing the lack of supervision information. The invention therefore selects a T-GCN model combining graph convolutional networks (GCNs) and GRU networks to model the learner's knowledge level.
Based on the above principle, as shown in fig. 2, the T-GCN model in step S2 includes a GCN model and a GRU model, where the GCN model is used to model the knowledge level of the learner at a certain moment, the GRU model is used to analyze the change rule of the knowledge level of the learner when learning different learning resources, and the T-GCN model tightly combines the GCN and the GRU, and continuously updates the knowledge level of the learner according to the learning record of the learner.
The operation of the GCN model is as follows:
the adjacency matrix Adj (N x N) and the learning record matrix X are input into the GCN model; the model acts on the nodes of the graph, capturing the spatial features between nodes through their first-order neighborhood, and by stacking several convolution layers the nodes acquire global features, expressed as:
Ho^{l+1} = σ( D̂^{-1/2} Â D̂^{-1/2} Ho^l θ^l )
where Â = Adj + I_N is the adjacency matrix with self-connections (N x N), I_N is the identity matrix, and D̂ is the degree matrix (N x N) with D̂_{ii} = Σ_j Â_{ij}; the introduction of the degree matrix allows the learner's knowledge level to be updated with these weights without additional supervision information. Ho^l is the output of layer l, θ^l contains the parameters of layer l, and σ(·) is the sigmoid function of the nonlinear model.
A 2-layer GCN model is chosen to mine the dependencies between concepts, which can be expressed as:
f(X, Adj) = ReLU( Â' ReLU( Â' X W_0 ) W_1 )
where Â' = D̂^{-1/2} Â D̂^{-1/2} denotes the preprocessing step, W_0 ∈ R^{(K·h)×Hid1} is the weight matrix from the input to the hidden layer (K is the length of the feature matrix and Hid1 the number of hidden units), W_1 ∈ R^{Hid1×Hid2} is the weight matrix from the hidden layer to the output layer, f(X, Adj) ∈ R^{N×Hid2} is the learner's knowledge level representation, and ReLU(·) is the activation layer of the deep neural network.
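The two-layer GCN forward pass can be sketched as follows; a minimal numpy illustration of the standard symmetrically-normalised propagation rule, with all shapes and names illustrative:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def gcn_two_layer(x, adj, w0, w1):
    """Two-layer GCN: f(X, Adj) = ReLU(A' ReLU(A' X W0) W1), where
    A' = D^-1/2 (Adj + I) D^-1/2 is the preprocessed adjacency with
    self-connections (X: N x K*h, W0: K*h x Hid1, W1: Hid1 x Hid2)."""
    a_hat = adj + np.eye(adj.shape[0])               # add self-connections
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt         # symmetric degree normalisation
    return relu(a_norm @ relu(a_norm @ x @ w0) @ w1)

rng = np.random.default_rng(1)
n, kh, hid1, hid2 = 5, 8, 6, 4
out = gcn_two_layer(rng.random((n, kh)),
                    (rng.random((n, n)) > 0.5).astype(float),
                    rng.random((kh, hid1)), rng.random((hid1, hid2)))
```

The output is one N x Hid2 knowledge-level representation, with the degree normalisation supplying the per-concept weights noted in the description.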
As shown in fig. 3, the GRU model operates in the following manner: the GRU takes the hidden state at the time t-1 as the current knowledge level input, and obtains the knowledge level at the time t. The model discovers the change rule of the knowledge level while capturing the knowledge level at the current moment, and is beneficial to recommending proper learning resources subsequently.
As shown in fig. 4, in the operation process of the T-GCN model, the left side is a process of knowledge level update, and the right side shows a specific structure of the T-GCN cell, and the calculation process of the T-GCN model is as follows:
u_t = σ(W_u [GC(X_t), h_{t-1}] + b_u)
r_t = σ(W_r [GC(X_t), h_{t-1}] + b_r)
c_t = tanh(W_c [GC(X_t), (r_t * h_{t-1})] + b_c)
h_t = u_t * h_{t-1} + (1 - u_t) * c_t
where h_{t-1} denotes the output at time t-1, GC(X_t) denotes the graph convolution process applied to the input at time t, u_t and r_t denote the update gate and reset gate at time t, c_t denotes the candidate knowledge level, and h_t denotes the output at time t.
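The cell computation can be sketched as follows; the gate equations follow the standard T-GCN/GRU form, with a random vector standing in for the graph-convolution output GC(X_t) and toy weight shapes that are assumptions, not the patent's trained parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tgcn_cell(gc_out, h_prev, params):
    """One T-GCN cell step: GRU gates driven by a graph-convolved input.

    gc_out  -- GC(X_t), graph convolution of the current input (stand-in here)
    h_prev  -- h_{t-1}, knowledge level from the previous step
    params  -- weight matrices/biases W_u, W_r, W_c, b_u, b_r, b_c
    """
    z = np.concatenate([gc_out, h_prev])              # [GC(X_t), h_{t-1}]
    u = sigmoid(params["W_u"] @ z + params["b_u"])    # update gate u_t
    r = sigmoid(params["W_r"] @ z + params["b_r"])    # reset gate r_t
    z_c = np.concatenate([gc_out, r * h_prev])
    c = np.tanh(params["W_c"] @ z_c + params["b_c"])  # candidate level c_t
    return u * h_prev + (1.0 - u) * c                 # h_t

rng = np.random.default_rng(1)
H = 4                                                 # hidden size (toy)
params = {k: rng.standard_normal((H, 2 * H)) * 0.1 for k in ("W_u", "W_r", "W_c")}
params.update({b: np.zeros(H) for b in ("b_u", "b_r", "b_c")})
h = np.zeros(H)
for _ in range(3):                                    # three learning records
    h = tgcn_cell(rng.standard_normal(H), h, params)
print(h.shape)                                        # (4,)
```

Iterating the cell over a learner's record is exactly the left-hand "knowledge level update" loop of fig. 4.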
S3: input the knowledge level into the softmax layer of the Decoder module to output the probability distribution over the concepts that need to be learned:
The hidden layer state h_t output by the GRU is used as the knowledge level S and input to the softmax layer in the Decoder module, which outputs the probability distribution over the concepts that need to be learned, specifically:
The initial knowledge level S 0 is expressed as:
S0=h0
Wherein h 0 represents an initial hidden layer state;
The knowledge level, the learning resources and the target concepts are jointly introduced to calculate a score for each knowledge concept in the current step. Assume that after learning of the (i-1)-th learning resource is completed, the corresponding learning resource is r_{i-1}; let r_{<i} denote the learning resource sequence r_{<i} = {r_1, r_2, …, r_{i-1}}, with corresponding knowledge concept set KP_{i-1}. Each concept c appears only once in KP_{i-1}, where Core(r_{i-1}) = c_1 denotes the center concept and c_2, c_3 denote the concepts adjacent to the center concept. Defining the learned set of center concepts c_{<i} = {Core(r_1), Core(r_2), …, Core(r_{i-1})} = {c_1, c_2, c_3, …}, the concept probability distribution of step i is output through the softmax layer:
score_j^i = W · tanh(W_1 e_j + W_2 S_i + W_3 e_T + b)
where j = 1, 2, …, N (the total number of concepts); e_j represents the feature representation of the j-th concept obtained by the encoder; e_T represents the fusion of the concept embeddings in the target concept set T; W, W_1, W_2, W_3 and b are learnable weights or matrices;
p(c_i^s = c_j) = exp(score_j^i) / Σ_{c_k ∈ C} exp(score_k^i)
where c_i^s represents the concept to be selected in step i, c_j represents concept number j, exp(·) computes the exponent for the concept distribution (the softmax operation), and C is the whole concept set;
sampling from this distribution at position i yields the knowledge concept c_i selected at that step.
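The scoring-plus-softmax step can be sketched as below; the additive scoring form and the names (e_j, S_i, e_T, the W matrices) follow the description above, but the toy dimensions and random parameters are hypothetical:

```python
import numpy as np

def concept_distribution(e_concepts, s_i, e_target, W, W1, W2, W3, b):
    """Score every concept against the current knowledge level and target,
    then softmax into a selection distribution (a sketch of
    score_j = W . tanh(W1 e_j + W2 S_i + W3 e_T + b))."""
    scores = np.array([
        W @ np.tanh(W1 @ e_j + W2 @ s_i + W3 @ e_target + b)
        for e_j in e_concepts
    ])
    exp_s = np.exp(scores - scores.max())      # numerically stable softmax
    return exp_s / exp_s.sum()

rng = np.random.default_rng(2)
N, D = 6, 8                                    # toy sizes (hypothetical)
e_concepts = rng.standard_normal((N, D))       # encoder concept features e_j
probs = concept_distribution(
    e_concepts,
    s_i=rng.standard_normal(D),                # current knowledge level S_i
    e_target=rng.standard_normal(D),           # fused target embedding e_T
    W=rng.standard_normal(D),
    W1=rng.standard_normal((D, D)) * 0.1,
    W2=rng.standard_normal((D, D)) * 0.1,
    W3=rng.standard_normal((D, D)) * 0.1,
    b=rng.standard_normal(D),
)
chosen = rng.choice(N, p=probs)                # sample the step-i concept
print(probs.shape)                             # (6,)
```

Sampling (rather than argmax) keeps the generation stochastic, which the candidate set module relies on.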
S4: inputting the probability distribution in the step S3 into a candidate concept generation layer in a candidate set generation module to generate a concept candidate set; inputting the concept candidate set into a candidate resource generation layer in a candidate set generation module to generate a learning resource candidate set; acquiring a learning path according to the learning resource candidate set;
the step S4 specifically comprises the following steps:
Random sampling at the candidate concept generation layer yields a candidate concept c_i^s; the concepts adjacent to it on the graph are included in the candidate concept set; these concepts are combined as candidate concept combinations, and the probability of each learning resource containing such a combination is calculated according to the probability formula to form the learning resource candidate set; the probability formula is:
p(r_i^s = r_j) = Σ_{c_k ∈ KP_j} degree(c_k) / Σ_{r_m} Σ_{c_k ∈ KP_m} degree(c_k)
where r_i^s represents the learning resource to be selected in the i-th step, r_j represents the j-th learning resource, KP_j represents the knowledge concept set corresponding to the j-th learning resource, and degree(c_k) represents the node degree of concept c_k;
when calculating learning resources, the probabilities of the different learning resources are computed using the node degrees, and the probability of an already-selected learning resource is set to 0;
adjacent concepts are then determined from the currently selected concept, and the learning resource candidate sets corresponding to all concept combinations are obtained; sampling from the resulting learning resource distribution p(r_i^s) over the candidate set yields the learning resource r_i at position i;
repeating the above process generates the final learning path LP = {r_1, r_2, …, r_n}, together with the probability p(c_i^s) of the concept and the probability p(r_i^s) of the learning resource at each step.
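A minimal sketch of the degree-weighted resource sampling described above, assuming the probability of a resource is proportional to the summed node degree of its concept set and that already-selected resources are masked to 0 (the exact normalization in the patent may differ):

```python
import numpy as np

def resource_candidate_probs(candidates, concept_sets, degree, used):
    """Degree-weighted distribution over candidate resources (a sketch):
    each resource's weight is the summed node degree of its concepts,
    with already-selected resources forced to probability 0."""
    weights = np.array([
        0.0 if r in used else sum(degree[c] for c in concept_sets[r])
        for r in candidates
    ])
    return weights / weights.sum()

# Hypothetical toy data: resource -> its knowledge concept set KP_j
concept_sets = {"r1": {"c1", "c2"}, "r2": {"c1", "c3"}, "r3": {"c2", "c3", "c4"}}
degree = {"c1": 3, "c2": 1, "c3": 2, "c4": 4}  # node degrees on the concept graph

candidates = ["r1", "r2", "r3"]
probs = resource_candidate_probs(candidates, concept_sets, degree, used={"r2"})
rng = np.random.default_rng(3)
r_i = candidates[rng.choice(len(candidates), p=probs)]
print(probs.tolist())        # r2 masked to 0; r3 carries the largest weight
```

Repeating this draw position by position yields the path LP = {r_1, …, r_n} along with the per-step probabilities used later in the loss.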
S5: the optimal learning path is selected by utilizing a pre-trained knowledge tracking module, and the method specifically comprises the following steps:
The Environment in reinforcement learning gives feedback to the agent and is used to adjust the agent's training weights. The environment in this example is represented by a pre-trained KT model. The probability of successfully mastering the i-th concept on the path is computed using a sigmoid function:
y_i = Sigmoid(f_y(S_i))
where f_y is an MLP layer and Sigmoid(·) is the sigmoid function;
The DKT model is one of the most classical knowledge tracking models and is therefore selected as the environment. The knowledge states at different moments are passed to the KT model; the environment calculates the loss from the current knowledge state according to the following formula and passes it to the agent. In addition, after the learning path is generated, an exercise record is obtained, and the learning effect E_T of each episode is also used as part of the loss and passed to the agent. Finally the trained Agent, i.e., the Decoder model parameters containing the T-GCN, is obtained;
The loss function is designed as follows:
L = L_θ + β L_y
where L is the total loss function, L_θ is the loss based on the learning resource recommended at each step, L_y is the loss based on the knowledge level at each moment, and β is the proportion occupied by L_y.
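The combined objective can be sketched as below; the concrete forms of L_θ (a policy-gradient-style term over the recommended resources) and L_y (cross-entropy against the KT environment's answer record) are assumptions for illustration, not the patented definitions:

```python
import numpy as np

def path_loss(log_probs, rewards, y_pred, y_true, beta=0.5):
    """Combined loss L = L_theta + beta * L_y (a sketch, not the patented form).

    L_theta -- policy-gradient style loss over the recommended resources
               (negative log-probability weighted by the learning-effect signal)
    L_y     -- binary cross-entropy between predicted mastery y_pred and
               the KT environment's answer record y_true
    """
    l_theta = -np.sum(np.asarray(log_probs) * np.asarray(rewards))
    eps = 1e-12
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    y_true = np.asarray(y_true, dtype=float)
    l_y = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return l_theta + beta * l_y

# Hypothetical numbers for a 3-step path
loss = path_loss(
    log_probs=[-0.7, -1.2, -0.4],   # log p of each recommended resource
    rewards=[0.2, 0.5, 0.1],        # per-step learning-effect signal
    y_pred=[0.8, 0.6, 0.9],         # mastery predicted by the agent
    y_true=[1, 0, 1],               # simulator's answer record
)
print(loss > 0)                     # True
```

β trades off how much the per-moment knowledge level, versus the per-step resource choice, drives the gradient.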
In order to verify the effectiveness and practical effect of the method, this embodiment performs experimental verification, specifically as follows:
1. Dataset and simulator
In order to verify the effectiveness of the T-GCN-based learning path recommendation method, this embodiment conducts experiments on the public educational dataset ASSIST09 and designs a simulator based on the DKT knowledge tracking model to simulate the learning process of a learner. Some statistics are as follows:
ASSIST09 dataset:
The dataset consists of elementary-school mathematics problem records sampled from the evaluation system Massachusetts Comprehensive Assessment System (MCAS) during 2009-2010. The initial dataset contains 525,535 interactions in total. This embodiment selects the latest updated version of the dataset, which contains 346,860 interactions (answering behaviors on questions) provided by 4,217 students, covering 26,688 different questions and 123 knowledge concepts; only about two thirds of the questions (17,751) are annotated with knowledge concepts, and each question corresponds to at most 4 knowledge concepts. The initial dataset has a large number of duplicate records and blank records, so the latest version is selected and further processed: i) interaction records lacking knowledge concept annotations are deleted, so that whether the actually recommended learning path is effective can be analyzed in the later case study; ii) users who answered fewer than 3 questions are deleted, so that the learners processed by the model are those learning on the platform for a long time rather than learners who drop out quickly. Specific statistics are shown in Table 1.
TABLE 1 data set statistics
DKT simulator:
In order to prevent the model from only generating learning paths that already exist in the history records, and thereby to generate better learning paths, this embodiment follows the methods of Liu et al. and Chen et al. and constructs a learner simulator. A KT model is trained on the dataset; a learner's learning record is input to the KT model so that it outputs the probability of answering each question correctly. After this training is completed, the learning effect E_T of a generated learning path can be evaluated.
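The simulator's role, scoring a generated path by the KT model's predicted correctness, can be sketched as follows; `toy_kt` is a hypothetical stand-in for the pre-trained DKT model:

```python
import numpy as np

def simulate_learning_effect(kt_predict, path, record):
    """Evaluate a generated path with a KT simulator (a sketch): feed the
    learner's record plus the path to the KT model step by step and take
    the mean predicted correctness over the path's questions as E_T.

    kt_predict -- stand-in for the pre-trained DKT model: maps an
                  interaction history to P(correct) for the next question
    """
    history = list(record)
    probs = []
    for q in path:
        p = kt_predict(history, q)
        probs.append(p)
        history.append((q, int(p > 0.5)))  # simulated answer outcome
    return float(np.mean(probs))

# Toy stand-in KT model: mastery grows with the amount of relevant practice
def toy_kt(history, q):
    seen = sum(1 for h, _ in history if h == q)
    return 1.0 - 0.5 ** (seen + 1)

e_t = simulate_learning_effect(toy_kt, path=["q1", "q1", "q2"], record=[])
print(0.0 < e_t < 1.0)                     # True
```

Because the simulator answers any path, the model is free to explore paths that never occur in the history records.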
2. Experimental setup
To fully test the performance of the model, the following variables were designed: n denotes the learning path length, taken as 10, 20, 30 in this embodiment; log_len denotes the initial learning record length, taken as 10, 15, 20 in this embodiment. During training, the learning rate was decayed from 1×10^-3 to 1×10^-5. The batch size was set to 128. The weight of the L2 regularization term was 4×10^-5. The dropout ratio was set to 0.5. The dimension of the embedding vectors was set to 64. All models were trained under the same hardware setup, using NVIDIA GeForce RTX 3090 cards.
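For reproducibility, the setup above amounts to a small hyperparameter grid; the dictionary below simply restates the stated values (the variable names themselves are illustrative):

```python
from itertools import product

# Hypothetical restatement of the experimental setup above; names are illustrative.
config = {
    "lr_start": 1e-3,   # learning rate decayed from 1e-3 ...
    "lr_end": 1e-5,     # ... down to 1e-5 during training
    "batch_size": 128,
    "l2_weight": 4e-5,  # weight of the L2 regularization term
    "dropout": 0.5,
    "embed_dim": 64,    # embedding vector dimension
}
grid = list(product([10, 20, 30], [10, 15, 20]))  # (n, log_len) combinations
print(len(grid))  # 9
```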
3. Comparison of experimental results
1. Baseline comparison methods
To verify the effectiveness and robustness of the model, the model of the present invention is compared with the following methods:
Random: n concepts are randomly chosen from C and randomly arranged.
Rule-based: the simulator learns each learning resource separately and returns a learning effect; the learning resources are then sorted by effect from small to large to generate a path.
MPC: model predictive control in RL, combined with KT, predicts the effect of several randomly searched paths at each step and takes the current action.
DQN: a classical reinforcement learning model; here a DKT model is pre-trained on the raw data to generate the states required by the DQN.
SRC: the state-of-the-art ordered-concept-aware learning path recommendation model, which achieves the best effect by combining a reinforcement learning framework with an attention mechanism.
The above baseline methods are learning path recommendation methods under the single-concept setting. Comparing with them explores whether the learning path recommendation method under multiple concepts can exceed the average learning effect under single concepts, thereby showing that the multi-concept learning path recommendation method provided by the invention can further improve learners' performance. The evaluation index of the experiment is the learning effect E_T.
2. Comparison of experimental results
The model is applied to a multi-concept learning scenario: it not only recommends multi-concept learning resources but also guarantees a better learning effect than methods in the single-concept scenario. Therefore, the model of the invention is compared with the five single-concept models above. The experimental results show that history learning records of different lengths affect the degree of score improvement, i.e., the learning effect, and that learning paths of different lengths also affect the degree of improvement. The model achieves the best learning effect for history learning records of different lengths and learning paths of different lengths.
History learning records of different lengths: at n = 20, this embodiment sets the learner's history log_len to 20, 15, 10. The experimental results show that the model of the invention outperforms all comparison methods at every history record length and achieves the state-of-the-art effect. According to the results, the shorter the history record, the greater the score improvement and the better the learning effect, which conforms to the rules of real learning scenarios. The Random method shows that under unordered learning the score does not increase but decreases, so a reasonably planned learning path is needed to improve the score in an effective time. The result of the MuTCoG method at history length log_len = 20 exceeds the log_len = 10 results of all other methods, which proves that the multi-concept learning path recommendation method yields a better learning effect, as expected: providing multi-concept learning resources improves the learner's ability to integrate concepts and further improves mastery of different concepts. The results of the comparative experiments controlling log_len on the DKT simulator are shown in Table 2.
Table 2-comparative experimental results of control log_len on DKT simulator
Learning paths of different lengths: at log_len = 10, the learner's learning path length n is set to 10, 20, 30. The experimental results show that the model of the invention outperforms all comparison methods at every learning path length and achieves the state-of-the-art effect. According to the results, the longer the learning path, the better the learning effect, which conforms to the rules of real learning scenarios. After n = 20, the improvement of both SRC and the proposed MuTCoG method slows down, which proves that the learner's effect improves quickly at the early stage and more slowly once mastery is high, conforming to the learning rule. The proposed method already reaches the effect of the baseline methods at n = 20, achieving a better learning effect with a shorter learning time, so the recommended learning path can meet learners' needs. The comparative experimental results controlling n on the DKT simulator are shown in Table 3.
Table 3-comparative experiment results of control n on DKT simulator
Claims (10)

1. A learning path recommending method based on multi-concept learning resources is characterized by comprising the following steps:
S1: inputting the history learning record into the Encoder module for processing to obtain a learning record matrix;
S2: inputting the learning record matrix and the adjacency matrix into a T-GCN model in a Decoder module to acquire a knowledge level;
S3: inputting the knowledge level into the softmax layer of the Decoder module to output the probability distribution over the concepts that need to be learned;
S4: inputting the probability distribution in the step S3 into a candidate concept generation layer in a candidate set generation module to generate a concept candidate set;
inputting the concept candidate set into a candidate resource generation layer in a candidate set generation module to generate a learning resource candidate set;
acquiring a learning path according to the learning resource candidate set;
S5: the best learning path is selected using a pre-trained knowledge tracking module.
2. The learning path recommending method based on the multi-concept learning resource according to claim 1, wherein the learning record matrix in step S1 is obtained by: firstly, preprocessing a history learning record through an embedding layer to obtain feature matrixes at different moments, and then performing dimension reduction operation on the feature matrixes at different moments by using a self-attention mechanism to obtain a learning record matrix.
3. The learning path recommending method based on the multi-concept learning resource according to claim 2, wherein the specific process of preprocessing the history learning record through the embedding layer in step S1 is as follows:
The history learning Record is constructed into a format that can be computed with the feature matrix, i.e., an N×K history Record matrix; N represents the number of knowledge concepts and K represents the total number of history records of the learner; each row represents the learner's answer to one question, the knowledge concepts corresponding to the question are marked with 1 or 0, and the remaining positions are 0; if the feature matrix at time t_0 is Fe_0, the feature matrix at time t_i is calculated by:
Fe_i = p(Fe_{i-1}, Record_{i-1}), 0 < i < K
where Record_{i-1} represents the (i-1)-th row of the Record matrix, and the function p adds this row to the corresponding answer row of the Fe_{i-1} matrix; that is, the first M rows of the Fe matrix represent the information of the question and the last M rows represent the answer situation; the feature matrices at different moments are thus obtained:
FEs = {Fe_0, …, Fe_{K-1}}.
4. the learning path recommending method based on the multi-concept learning resource according to claim 3, wherein the specific process of performing the dimension reduction operation on the feature matrix at different moments by using the self-attention mechanism in the step S1 to obtain the learning record matrix is as follows:
All matrices in FEs are processed separately as follows:
each Fe_i is reduced in dimension to an N×h matrix, where h represents the dimension of the hidden layer and can be adjusted according to the actual training process; finally the learning record matrix X is constructed, i.e., the K N×h matrices are stacked along the feature dimension into an N×(K*h) matrix, and the adjacency matrix Adj preprocessed in the preceding work is obtained.
5. The learning path recommendation method based on multi-concept learning resources according to claim 1, wherein the T-GCN model in step S2 includes a GCN model for modeling the knowledge level of the learner at a given moment and a GRU model for analyzing how the knowledge level of the learner changes when learning different learning resources, and the T-GCN model tightly combines the GCN and the GRU to continuously update the knowledge level of the learner according to the learner's learning record.
6. The learning path recommending method based on the multi-concept learning resource according to claim 5, wherein the operating method of the GCN model in step S2 is as follows:
The adjacency matrix Adj (N×N) and the learning record matrix X are input into the GCN model; the model acts on the nodes in the graph, captures the spatial features among nodes through their first-order neighborhood, and then lets the nodes acquire global features by stacking multiple convolution layers, expressed as:
Ho^(l+1) = σ(D̃^(-1/2) Ã D̃^(-1/2) Ho^(l) θ^(l))
where Ã = Adj + I_N is the adjacency matrix with self-connections (N×N), I_N is the identity matrix, and D̃ is the degree matrix (N×N) with D̃_ii = Σ_j Ã_ij; the addition of the degree matrix enables the learner's knowledge level to be updated with these weights without acquiring additional supervision information; Ho^(l) is the output of layer l, θ^(l) contains the parameters of layer l, and σ(·) represents the sigmoid function of the nonlinear model;
a 2-layer GCN model is chosen to mine inter-concept dependencies; with X denoting the learning record matrix, this can be expressed as:
f(X, Adj) = σ(Â ReLU(Â X W_0) W_1)
where Â = D̃^(-1/2) Ã D̃^(-1/2) represents the preprocessing step, W_0 ∈ R^((K*h)×Hid1) is the weight matrix from the input to the hidden layer, K is the length of the feature matrix, Hid1 is the number of hidden units, W_1 ∈ R^(Hid1×Hid2) is the weight matrix from the hidden layer to the output layer, f(X, Adj) ∈ R^(N×Hid2) represents the learner's knowledge level vector of dimension N×Hid2, and ReLU(·) is the activation layer in the deep neural network;
The GRU model operates as follows: the GRU takes the hidden state at time t-1 as the current knowledge level input and obtains the knowledge level at time t.
7. The learning path recommendation method based on the multi-concept learning resource according to claim 5, wherein the calculation process of the T-GCN model in step S2 is as follows:
u_t = σ(W_u [GC(X_t), h_{t-1}] + b_u)
r_t = σ(W_r [GC(X_t), h_{t-1}] + b_r)
c_t = tanh(W_c [GC(X_t), (r_t * h_{t-1})] + b_c)
h_t = u_t * h_{t-1} + (1 - u_t) * c_t
where h_{t-1} denotes the output at time t-1, GC(X_t) denotes the graph convolution process applied to the input at time t, u_t and r_t denote the update gate and reset gate at time t, c_t denotes the candidate knowledge level, and h_t denotes the output at time t.
8. The learning path recommendation method based on multi-concept learning resources according to claim 7, wherein in step S3 the hidden layer state h_t output by the GRU is used as the knowledge level S and input to the softmax layer in the Decoder module, which outputs the probability distribution over the concepts that need to be learned, specifically:
The initial knowledge level S 0 is expressed as:
S0=h0
Wherein h 0 represents an initial hidden layer state;
The knowledge level, the learning resources and the target concepts are jointly introduced to calculate a score for each knowledge concept in the current step. Assume that after learning of the (i-1)-th learning resource is completed, the corresponding learning resource is r_{i-1}; let r_{<i} denote the learning resource sequence r_{<i} = {r_1, r_2, …, r_{i-1}}, with corresponding knowledge concept set KP_{i-1}. Each concept c appears only once in KP_{i-1}, where Core(r_{i-1}) = c_1 denotes the center concept and c_2, c_3 denote the concepts adjacent to the center concept. Defining the learned set of center concepts c_{<i} = {Core(r_1), Core(r_2), …, Core(r_{i-1})} = {c_1, c_2, c_3, …}, the concept probability distribution of step i is output through the softmax layer:
score_j^i = W · tanh(W_1 e_j + W_2 S_i + W_3 e_T + b)
where j = 1, 2, …, N; e_j represents the feature representation of the j-th concept obtained by the encoder; e_T represents the fusion of the concept embeddings in the target concept set T; W, W_1, W_2, W_3 and b are learnable weights or matrices;
p(c_i^s = c_j) = exp(score_j^i) / Σ_{c_k ∈ C} exp(score_k^i)
where c_i^s represents the concept to be selected in step i, c_j represents concept number j, exp(·) computes the exponent for the concept distribution, and C is the whole concept set;
sampling from this distribution at position i yields the knowledge concept c_i selected at that step.
9. The learning path recommendation method based on the multi-concept learning resource according to claim 8, wherein the step S4 specifically includes:
Random sampling at the candidate concept generation layer yields a candidate concept c_i^s; the concepts adjacent to it on the graph are included in the candidate concept set; these concepts are combined as candidate concept combinations, and the probability of each learning resource containing such a combination is calculated according to the probability formula to form the learning resource candidate set; the probability formula is:
p(r_i^s = r_j) = Σ_{c_k ∈ KP_j} degree(c_k) / Σ_{r_m} Σ_{c_k ∈ KP_m} degree(c_k)
where r_i^s represents the learning resource to be selected in the i-th step, r_j represents the j-th learning resource, KP_j represents the knowledge concept set corresponding to the j-th learning resource, and degree(c_k) represents the node degree of concept c_k;
when calculating learning resources, the probabilities of the different learning resources are computed using the node degrees, and the probability of an already-selected learning resource is set to 0;
adjacent concepts are then determined from the currently selected concept, and the learning resource candidate sets corresponding to all concept combinations are obtained; sampling from the resulting learning resource distribution p(r_i^s) over the candidate set yields the learning resource r_i at position i;
repeating the above process generates the final learning path LP = {r_1, r_2, …, r_n}, together with the probability p(c_i^s) of the concept and the probability p(r_i^s) of the learning resource at each step.
10. The learning path recommendation method based on the multi-concept learning resource according to claim 1, wherein the step S5 specifically includes:
the probability of successfully mastering the i-th concept on the path is computed using a sigmoid function:
y_i = Sigmoid(f_y(S_i))
where f_y is an MLP layer and Sigmoid(·) is the sigmoid function;
the knowledge states at different moments are passed to the KT model; the environment calculates the loss from the current knowledge state according to the following formula and passes it to the agent; in addition, after the learning path is generated, an exercise record is obtained, and the learning effect E_T of each episode is also used as part of the loss and passed to the agent; finally the trained Agent, i.e., the Decoder model parameters containing the T-GCN, is obtained;
The loss function is designed as follows:
L = L_θ + β L_y
where L is the total loss function, L_θ is the loss based on the learning resource recommended at each step, L_y is the loss based on the knowledge level at each moment, and β is the proportion occupied by L_y.
CN202410318122.9A 2024-03-20 2024-03-20 Learning path recommendation method based on multi-concept learning resources Pending CN118170979A (en)

CN118170979A true CN118170979A (en) 2024-06-11


