CN116258056A

CN116258056A - Multi-modal knowledge level assessment and learning performance prediction method, system and medium

Info

Publication number: CN116258056A
Application number: CN202211566063.4A
Authority: CN
Inventors: 王志锋; 陆子欣; 左明章; 王继新; 董石; 罗恒; 田元; 闵秋莎; 夏丹
Original assignee: Central China Normal University
Current assignee: Central China Normal University
Priority date: 2022-12-07
Filing date: 2022-12-07
Publication date: 2023-06-13

Abstract

The invention belongs to the technical field of personalized learning and discloses a multi-modal knowledge level assessment and learning performance prediction method, system and medium. The method comprises the following steps: collecting learning behavior features and learning resource features of a learner; introducing a recurrent neural network and a convolutional neural network to perform representation learning on the text semantic features and image semantic features of test questions, respectively, and obtaining test question fusion semantic features through feature fusion; constructing a gating-based learning state tracking model and predicting the commonality knowledge increment to be acquired by the learner; fusing the learner's commonality knowledge increment with the test question fusion semantic features, predicting the learner's response at the next moment, and training the model; and predicting the learner's learning knowledge state using the trained parameters. The method and system not only predict learner responses more accurately but also predict the learner's learning knowledge state, thereby helping students to study in a more targeted and efficient way.

Description

Multi-modal knowledge level assessment and learning performance prediction method, system and medium

Technical Field

The invention belongs to the technical field of personalized learning, and particularly relates to a multi-modal knowledge level assessment and learning performance prediction method, a system and a medium.

Background

With the steady advance of educational informatization, online learning, which arises from the integration of traditional education and information technology, has gradually entered public view. Unlike the traditional teaching mode, online education platforms such as MOOC platforms, XuetangX and edX provide high-quality multimedia teaching resources from around the world, and students can conveniently browse courses at any time. However, as learning resources grow in volume and complexity, students find it difficult to pick out the resources that meet their own learning needs, which runs counter to the goal of online education, namely improving students' learning efficiency and outcomes.

The goal of learning performance prediction is to model how a learner's knowledge state changes over time, based on the learner's past learning behavior in a given learning task, so as to predict the learner's performance on the next response. Learning performance prediction methods can analyze students' learning from the large amount of data they leave behind during online learning, and can thus provide personalized learning plans that meet students' individual needs. At present, mainstream learning performance prediction methods fall into two categories according to the machine learning approach used: traditional methods and deep-learning-based methods. Traditional methods are represented by Bayesian learning performance prediction, which predicts the next state from a state transition matrix and predicts the student's response from the current state. However, this approach uses a binary representation of the student's knowledge mastery, and its predictions are inaccurate on large data sets. Deep-learning-based methods introduce recurrent neural networks to handle the task and greatly improve accuracy, but they consider only the knowledge point features of test questions, model the learning state in a single way, offer limited interpretability, and still leave room for improvement in the stability and accuracy of their predictions.

Because deep-learning-based learning performance prediction methods build on large amounts of data and outperform traditional methods in accuracy and other respects, the following discussion of the prior art focuses on deep-learning-based methods.

Through the above analysis, the problems and defects existing in the prior art are as follows:

(1) In the prior art, the learning performance prediction method based on deep learning only considers knowledge points of test questions as characteristics, and ignores the influence of text and image information of the test questions on the response of learners;

(2) The learning performance prediction method based on deep learning in the prior art is not accurate enough in modeling of learner knowledge mastering, and influences of specific concept test questions on the whole knowledge state of the learner are ignored.

Disclosure of Invention

In view of the problems existing in the prior art, the invention provides a multi-modal knowledge level assessment and learning performance prediction method, system and medium.

The invention is realized in such a way that a multi-modal knowledge level assessment and learning performance prediction method comprises:

collecting learning behavior features and learning resource features of a learner; introducing a recurrent neural network and a convolutional neural network to perform deep representation learning on the text semantic features and image semantic features of the test questions, respectively, and obtaining test question fusion semantic features through feature fusion; constructing a gating-based learning state tracking model to predict the commonality knowledge increment acquired by the learner; fusing the commonality knowledge increment with the test question fusion semantic features, predicting the response at the next moment, and constructing a loss function for training; and predicting the learning knowledge state using the trained parameters.

Further, the learning behavior features and the learning resource features are acquired from a response sequence of a learner, the learning behavior features are response features containing time sequence information, the learning resource features comprise content features and knowledge point features of the learning resource, and the content features of the learning resource are divided into text content features and image content features of test questions;

the recurrent neural network performs deep representation learning on the text content features of the test questions as follows:

the test question text samples are preprocessed by word segmentation and stop-word removal; the test questions are then processed with the CBOW algorithm of the word vector model Word2vec, which predicts a target word from its context, yielding word-level embedded representations of the test questions; a bidirectional gated recurrent unit network is used to obtain the word-level semantic features of the test question text in the forward and backward directions, respectively:

where $a_1$, $a_2$, $b_1$, $b_2$ are weight coefficients, tanh is the activation function, and the two outputs are the forward and backward hidden-layer outputs at time $t$, respectively;

the forward and backward hidden-layer outputs are fused to obtain the word-level bidirectional semantic features of the test questions:

the bidirectional semantic features are mapped by a max pooling operation to obtain the text semantic features of the test questions:

$x_v = \max(v_1, v_2, \ldots, v_T)$

where $T$ is the number of words in the test question at the sentence level;
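As an illustration of the pooling step, the following minimal sketch assumes that the max operation is applied element-wise across the T word-level feature vectors, which is the usual reading of max pooling over a sequence. The function name `max_pool_over_words` and the toy vectors are hypothetical and are not taken from the invention.

```python
from typing import List


def max_pool_over_words(word_features: List[List[float]]) -> List[float]:
    """Element-wise max over the T word-level vectors v_1..v_T, giving x_v."""
    if not word_features:
        raise ValueError("at least one word-level feature vector is required")
    dim = len(word_features[0])
    if any(len(v) != dim for v in word_features):
        raise ValueError("all word-level vectors must share one dimension")
    # For each feature dimension, keep the largest value seen across the words.
    return [max(v[d] for v in word_features) for d in range(dim)]


# Toy example: T = 3 words with 4-dimensional features (made-up values).
v = [
    [0.1, -0.5, 0.3, 0.0],
    [0.4, -0.2, 0.1, 0.2],
    [-0.3, 0.6, 0.2, 0.1],
]
x_v = max_pool_over_words(v)
print(x_v)  # [0.4, 0.6, 0.3, 0.2]
```

Under this reading, the sentence-level vector has the same dimension as each word-level vector regardless of the number of words T.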

the convolutional neural network performs deep representation learning on the image content features of the test questions as follows:

the test question image samples are resized to a uniform size of 128 × 128; a convolutional neural network is constructed for the test question image content, comprising nine layers, of which the first six are alternating convolutional and pooling layers and the last three are linear layers; the three convolutional layers use 3 × 3 convolution kernels, and their input feature maps have sizes 128, 64 and 32, respectively; the three pooling layers are all max pooling layers using a 2 × 2 filter with a stride of 2; the test question image samples are input into the constructed convolutional neural network to obtain the image semantic feature representation of the test questions.
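The stated feature-map sizes can be checked with a short calculation. The sketch below assumes that each 3 × 3 convolution uses stride 1 and padding 1 (so that it preserves the spatial size); the text does not state the stride or padding, but this assumption is consistent with the listed input sizes of 128, 64 and 32. The function names are hypothetical.

```python
from typing import List, Tuple


def conv_output_size(size: int, kernel: int = 3, stride: int = 1, padding: int = 1) -> int:
    """Spatial output size of a convolution (padding and stride are assumptions)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size: int, kernel: int = 2, stride: int = 2) -> int:
    """Spatial output size of a 2 x 2 max pooling layer with stride 2."""
    return (size - kernel) // stride + 1


def trace_feature_map_sizes(input_size: int = 128, num_blocks: int = 3) -> List[Tuple[int, int]]:
    """Return (conv input size, pool output size) for each conv + pool block."""
    sizes = []
    size = input_size
    for _ in range(num_blocks):
        conv_in = size
        size = pool_output_size(conv_output_size(size))
        sizes.append((conv_in, size))
    return sizes


sizes = trace_feature_map_sizes()
for block, (conv_in, pool_out) in enumerate(sizes, start=1):
    print(f"block {block}: conv input {conv_in} x {conv_in}, after pooling {pool_out} x {pool_out}")
```

With these assumptions the three convolutional layers receive 128, 64 and 32 pixel feature maps, and the last pooling layer outputs a 16 × 16 map that would be flattened before the three linear layers.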

Further, the test question fusion semantic features are expressed as:

where $x_{vt}$ is the text semantic feature of the test question answered at time $t$, and $x_{ct}$ is the image semantic feature of the test question answered at time $t$;

the test question fusion semantic features and the response features are combined to obtain the comprehensive response vector of the learner, expressed as:

where $x_t$ is the test question fusion semantic feature vector; $r_t$ is the collected response feature of the learner, i.e., the actual response result; and $0$ is an all-zero vector of the same dimension as $x_t$.

Further, constructing the gating-based learning state tracking model to predict the commonality knowledge increment acquired by the learner specifically comprises:

inputting the comprehensive response vector of the learner into a gated recurrent unit network, calculating the hidden state at time $t$, and thereby constructing a gating-based learning state tracking model that tracks the learner's learning state matrix as it changes over time:

$Z_t = \mathrm{sigmoid}(W_Z \cdot [H_{t-1}, c_t] + b_Z)$

$R_t = \mathrm{sigmoid}(W_R \cdot [H_{t-1}, c_t] + b_R)$

where $Z_t$ and $R_t$ are the gate values of the update gate and the reset gate, respectively, each obtained by concatenating the hidden state $H_{t-1}$ at time $t-1$ with the comprehensive response vector $c_t$ at time $t$ and applying a linear transformation; $W_Z$, $W_R$ and $W$ are weight matrices; $b_Z$, $b_R$ and $b$ are the corresponding bias terms; sigmoid and tanh are activation functions; $\tilde{H}_t$ is the candidate hidden state at time $t$; and $H_t$ denotes the hidden state at time $t$, which also serves as the learning state matrix at that time;
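A single step of such a gated recurrent unit can be sketched as follows. The update gate and reset gate follow the two formulas given above; the candidate-state and state-update formulas are not reproduced in the text, so the sketch assumes one common GRU convention: candidate = tanh(W·[R_t ⊙ H_{t-1}, c_t] + b) and H_t = (1 − Z_t) ⊙ H_{t-1} + Z_t ⊙ candidate. For brevity the hidden state is treated as a vector rather than a matrix, and all names and values are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def affine(W: Matrix, x: Vector, b: Vector) -> Vector:
    """Compute W·x + b for a row-major weight matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def gru_step(H_prev: Vector, c_t: Vector,
             W_Z: Matrix, b_Z: Vector,
             W_R: Matrix, b_R: Vector,
             W: Matrix, b: Vector) -> Vector:
    """One GRU step; the last two formulas are an assumed convention."""
    concat = H_prev + c_t  # [H_{t-1}, c_t]
    Z_t = [sigmoid(a) for a in affine(W_Z, concat, b_Z)]  # update gate
    R_t = [sigmoid(a) for a in affine(W_R, concat, b_R)]  # reset gate
    # Assumed candidate state: the reset gate scales the previous hidden state.
    gated = [r * h for r, h in zip(R_t, H_prev)] + c_t
    H_cand = [math.tanh(a) for a in affine(W, gated, b)]
    # Assumed update: interpolate between the old state and the candidate.
    return [(1.0 - z) * h + z * hc for z, h, hc in zip(Z_t, H_prev, H_cand)]


# Toy example: hidden size 2, comprehensive response vector of size 2.
H_prev = [0.2, -0.4]
c_t = [1.0, 0.0]
W_small = [[0.1, -0.2, 0.3, 0.0], [0.0, 0.1, -0.1, 0.2]]
bias = [0.0, 0.1]
H_t = gru_step(H_prev, c_t, W_small, bias, W_small, bias, W_small, bias)
print(H_t)
```

Libraries differ on which of Z_t and 1 − Z_t multiplies the previous state, so the update line should be checked against the original formulas before reuse.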

the collected knowledge point features of the test questions are embedded through a linear layer and represented as knowledge embedding vectors;

a static knowledge embedding matrix storing the information of each knowledge point is constructed, and the correlation coefficients between the knowledge point corresponding to the test question answered by the learner at the current moment and all knowledge points are calculated to obtain the knowledge influence weights:

where knowledge point $i \in (1, K)$, $K$ is the number of all knowledge points, $k_t$ is the knowledge embedding vector at time $t$, and $M_i$ is the knowledge embedding vector corresponding to the $i$-th knowledge point in the knowledge embedding matrix $M$;

based on the Markov property, the knowledge influence weights of the knowledge points corresponding to the test question at the next moment are combined with the learner's learning state matrix at the current moment to predict the commonality knowledge increment that the learner will acquire at the next moment:

where the indexed term is the learning state vector of the learner for knowledge point $i$ at time $t$, and the learning state matrix $H_t$ consists of $K$ such learning state vectors.
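The two formulas for the knowledge influence weights and the commonality knowledge increment are not reproduced in the text. The sketch below shows one plausible reading, stated here as an assumption: the weights are a softmax over the inner products between the knowledge embedding vector of the question and each row M_i of the knowledge embedding matrix, and the increment is the weighted sum of the K learning state vectors. The function names and toy values are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def knowledge_influence_weights(k_vec: Vector, M: Matrix) -> Vector:
    """Assumed form: softmax over inner products between k_vec and each M_i."""
    scores = [sum(a * b for a, b in zip(k_vec, M_i)) for M_i in M]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def commonality_knowledge_increment(weights: Vector, H_t: Matrix) -> Vector:
    """Assumed form: weighted sum of the K learning state vectors in H_t."""
    dim = len(H_t[0])
    return [sum(w * h_i[d] for w, h_i in zip(weights, H_t)) for d in range(dim)]


# Toy example with K = 3 knowledge points and 2-dimensional vectors.
M = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
k_next = [1.0, 0.0]  # embedding of the knowledge point of the next question
H_t = [[0.2, 0.1], [0.5, 0.4], [0.3, 0.9]]  # one state vector per knowledge point

beta = knowledge_influence_weights(k_next, M)
increment = commonality_knowledge_increment(beta, H_t)
print(beta, increment)
```

In this reading, knowledge points whose embeddings are closest to that of the upcoming question contribute most to the predicted increment.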

Further, the commonality knowledge increment is fused with the test question fusion semantic features to obtain the question learning vector, expressed as:

the fused representation is linearly transformed and passed through activation functions to predict the learner's response result at the next moment:

where the output is the predicted response result of the learner at time $t+1$; $W_1$, $W_2$ are weight matrices; $b_1$, $b_2$ are the corresponding bias terms; and relu and sigmoid are activation functions;

a loss function $L$ is defined from the predicted response result and the actual response result of the learner:

finally, the Adam optimizer is used to learn and update the network parameters of the model.
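The prediction and loss steps can be sketched as below. Three points are assumptions rather than statements of the invention: the fusion is taken to be a concatenation, the prediction is taken to have the form sigmoid(W2·relu(W1·v + b1) + b2) (the same form the text gives for the knowledge mastery prediction, which reuses these parameters), and the loss L is taken to be binary cross-entropy, in line with the BCELoss metric reported for the experiments. Parameter updates with Adam are not shown, and all names are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def affine(W: Matrix, x: Vector, b: Vector) -> Vector:
    """Compute W·x + b for a row-major weight matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def predict_response(increment: Vector, x_next: Vector,
                     W1: Matrix, b1: Vector, W2: Matrix, b2: Vector) -> float:
    """Probability of a correct response at the next moment (assumed form)."""
    v = increment + x_next  # fusion by concatenation (assumption)
    hidden = [max(0.0, a) for a in affine(W1, v, b1)]  # relu
    logit = affine(W2, hidden, b2)[0]
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid


def binary_cross_entropy(y_pred: Vector, y_true: List[int], eps: float = 1e-12) -> float:
    """Mean binary cross-entropy between predicted and actual responses."""
    total = 0.0
    for p, r in zip(y_pred, y_true):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= r * math.log(p) + (1 - r) * math.log(1.0 - p)
    return total / len(y_pred)


# Toy example: 2-dimensional increment and question features, 2 hidden units.
W1 = [[0.5, -0.3, 0.2, 0.1], [0.0, 0.4, -0.2, 0.3]]
b1 = [0.0, 0.1]
W2 = [[0.6, -0.4]]
b2 = [0.05]
y_hat = predict_response([0.3, 0.5], [0.4, 0.6], W1, b1, W2, b2)
loss = binary_cross_entropy([y_hat], [1])
print(y_hat, loss)
```

In a full implementation the loss would be averaged over all learner interactions in a batch and minimized with an Adam optimizer from a deep learning framework.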

Further, applying the trained parameters to predict the learning knowledge state comprises:

concatenating the learning state vector for each concept at each time with a zero vector of the same dimension to obtain a knowledge change vector from which the question semantic information has been removed:

taking the knowledge change vector as input and outputting the predicted degree of knowledge mastery through a linear layer and an activation layer:

$g_t = \mathrm{sigmoid}(W_2 \cdot \mathrm{relu}(W_1 \cdot v_t' + b_1) + b_2)$

where $g_t$ is the predicted degree of knowledge mastery, and $W_1$, $W_2$, $b_1$, $b_2$ are the parameters of the linear layers used in response prediction.
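The mastery prediction can be sketched as follows, using the sigmoid(W2·relu(W1·v_t′ + b1) + b2) expression given above. The order of concatenation (state vector first, zero vector second) is an assumption, since the concatenation formula is not reproduced in the text; the function name and parameter values are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mastery_levels(H_t: Matrix, W1: Matrix, b1: Vector, W2: Matrix, b2: Vector) -> Vector:
    """Predicted mastery in (0, 1) for each of the K knowledge points."""
    levels = []
    for h_i in H_t:
        # Replace the question semantics with zeros (concatenation order assumed).
        v_prime = h_i + [0.0] * len(h_i)
        hidden = [
            max(0.0, sum(w * x for w, x in zip(row, v_prime)) + bias)  # relu
            for row, bias in zip(W1, b1)
        ]
        logit = sum(w * h for w, h in zip(W2[0], hidden)) + b2[0]
        levels.append(1.0 / (1.0 + math.exp(-logit)))  # sigmoid
    return levels


# Toy example: K = 3 knowledge points, 2-dimensional state vectors,
# reusing (made-up) response-prediction parameters.
H_t = [[0.2, 0.1], [0.5, 0.4], [0.3, 0.9]]
W1 = [[0.5, -0.3, 0.2, 0.1], [0.0, 0.4, -0.2, 0.3]]
b1 = [0.0, 0.1]
W2 = [[0.6, -0.4]]
b2 = [0.05]
g_t = mastery_levels(H_t, W1, b1, W2, b2)
print(g_t)
```

Because the zero vector occupies the slot normally filled by the question semantics, the output depends only on the learner's state for each knowledge point.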

Another object of the present invention is to provide a multi-modal knowledge level assessment and learning performance prediction system implementing the multi-modal knowledge level assessment and learning performance prediction method, the system comprising:

the test question multi-modal fusion semantic modeling module, which uses a recurrent neural network and a self-designed convolutional neural network to perform deep representation learning on the semantic features of the test questions on the text channel and the image channel, respectively, according to the content of the learning resources, and performs feature fusion to obtain multi-modal test question fusion semantic features;

the test question commonality knowledge modeling module, which calculates the association coefficients among knowledge points from the embedded representations of the knowledge points of the learning resources, so as to obtain the knowledge influence weights;

the learner response prediction module, which, based on the learner's learning interaction sequence, constructs the learner's learning state vector and a comprehensive learning vector representation of the question to be answered from the learner's learning behavior features and the features of the learning resource to be answered, and predicts the learner's response result;

and the learner knowledge mastery analysis module, which uses the parameters of the response prediction module together with the learner's learning state vector to predict the learner's mastery of each knowledge point in the learning resources.

It is a further object of the present invention to provide a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the multi-modal knowledge level assessment and learning performance prediction method.

It is a further object of the present invention to provide a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the multi-modal knowledge level assessment and learning performance prediction method.

Another object of the present invention is to provide an information data processing terminal for implementing the multi-modal knowledge level assessment and learning performance prediction system.

In combination with the technical scheme and the technical problems to be solved, the technical scheme to be protected has the following advantages and positive effects:

First, with regard to the technical problems in the prior art and the difficulty of solving them, the technical problems solved by the claimed technical scheme are analyzed in detail below in close connection with the scheme and with the results and data obtained during research and development; solving these problems brings inventive technical effects, described as follows:

(1) In the prior art, deep-learning-based learning performance prediction methods consider only the knowledge points of test questions as features and ignore the influence of the text and image information of the test questions on learner responses. The difficulty lies in how to effectively extract and fuse the multi-modal semantic features of test questions and how to establish the mapping between test question semantics and learner responses. The invention makes full use of the text and image information of the test questions the learner interacts with as input, introduces a recurrent neural network and a convolutional neural network, respectively, and performs feature extraction, representation and fusion on the text and image information to construct the individual semantic features of each test question. Learning resources are thus better represented, which allows the model to make more accurate subsequent predictions. Experimental results show that the method outperforms other deep-learning-based learning performance prediction methods in terms of AUC, BCELoss and other metrics;

(2) Deep-learning-based learning performance prediction methods in the prior art do not model learner knowledge mastery accurately enough and ignore the influence of test questions on specific concepts on the learner's overall knowledge state. The difficulty lies in how to construct the learner's knowledge state and, based on the relations between knowledge points, fully account for the influence of the test questions on that knowledge state. The invention mines the relations between the knowledge points of the learning resources, introduces a bidirectional recurrent neural network, performs deep representation learning on the learner's learning state matrix, and quantifies the learner's learning gain.

Secondly, the technical scheme is regarded as a whole or from the perspective of products, and the technical scheme to be protected has the following technical effects and advantages:

the multi-modal knowledge level assessment and learning performance prediction method provided by the invention performs deep representation learning on the multi-modal semantic features of test questions and fuses them, then introduces a gated recurrent unit (GRU) to dynamically track the learning state of students, obtains the learner's commonality knowledge increment using the Markov property, and combines it with the test question fusion semantic features to predict the learner's response and learning knowledge state.

Thirdly, as inventive supplementary evidence of the claims of the present invention, the following important aspects are also presented:

(1) The expected benefits and commercial values after the technical scheme of the invention is converted are as follows:

by introducing the multi-modal semantics of the questions, the multi-modal knowledge level assessment and learning performance prediction method provided by the invention makes use of richer information, so that the analysis of learner knowledge mastery is more accurate and effective, the learning efficiency of learners is improved, and the application of learning performance prediction in the education field is expanded; the method therefore has great commercial value.

(2) The technical scheme of the invention fills a technical gap in the industry at home and abroad:

the learning performance prediction method based on deep learning in the prior art lacks accuracy in modeling of learner knowledge mastering, and does not consider the influence of specific concept test questions on the learner knowledge state construction. The invention calculates the knowledge influence weight of the knowledge concept on the learner, and realizes more accurate prediction analysis on the knowledge state and knowledge mastering condition of the learner.

(3) Whether the technical scheme of the invention solves a technical problem that people have long wanted to solve but have never succeeded in solving:

deep-learning-based learning performance prediction methods in the prior art ignore the relations among test questions on different concepts; the invention fully considers the relations between the knowledge points of different test questions and applies them to the dynamic tracking of the learner's knowledge state and the analysis of knowledge mastery, thereby improving the interpretability of the prediction results and helping students study in a more targeted way.

(4) The technical scheme of the invention overcomes a technical prejudice:

deep-learning-based learning performance prediction methods in the prior art generally ignore the influence of the text and image information of test questions on learner responses; the invention uses a recurrent neural network and a convolutional neural network to extract features from the text and images of test questions, respectively, constructs the individual semantic features of each test question through feature fusion, tracks changes in the student's knowledge state in combination with the student's responses, and accurately predicts future response results.

Drawings

FIG. 1 is a flowchart of a multi-modal knowledge level assessment and learning performance prediction method provided by an embodiment of the present invention;

FIG. 2 is a schematic diagram of a multi-modal knowledge level assessment and learning performance prediction method provided by an embodiment of the present invention;

FIG. 3 is a schematic diagram of a multi-modal knowledge level assessment and learning performance prediction system according to an embodiment of the present invention;

in the figure: 1, test question multi-modal fusion semantic modeling module; 2, test question commonality knowledge modeling module; 3, learner response prediction module; 4, learner knowledge mastery analysis module;

FIG. 4 is a convolutional neural network structure diagram for a test question image provided by an embodiment of the present invention;

FIG. 5 is a schematic diagram comparing the experimental AUC results on the data set TIMSS2007 provided by an embodiment of the present invention;

FIG. 6 is a schematic diagram comparing the experimental BCELoss results on the data set TIMSS2007 provided by an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

To enable those skilled in the art to fully understand how the invention is implemented, this section provides illustrative embodiments that elaborate on the claimed technical scheme.

As shown in fig. 1, the method for multi-modal knowledge level assessment and learning performance prediction provided by the embodiment of the invention includes the following steps:

S101, collecting the learning behavior features and learning resource features of a learner from the learner's response sequence; introducing a recurrent neural network and a convolutional neural network to perform deep representation learning on the text semantic features and image semantic features of the test questions, respectively; obtaining test question fusion semantic features through feature fusion; and obtaining a comprehensive response vector by combining them with the learning behavior features;

S102, introducing a recurrent neural network, constructing a gating-based learning state tracking model, and predicting the commonality knowledge increment to be acquired by the learner from the learner's learning state matrix and the knowledge influence weights corresponding to the test question to be answered;

S103, fusing the learner's commonality knowledge increment with the test question fusion semantic features, predicting the learner's response at the next moment, and constructing a corresponding loss function for training;

S104, applying the trained parameters to predict the learner's learning knowledge state, and capturing and analyzing how the learner's degree of mastery of each knowledge point changes over time.

The schematic diagram of the multi-modal knowledge level assessment and learning performance prediction method provided by the embodiment of the invention is shown in fig. 2.

As shown in fig. 3, the multi-modal knowledge level assessment and learning performance prediction system provided by the embodiment of the invention includes:

the test question multi-modal fusion semantic modeling module 1, which uses a recurrent neural network and a self-designed convolutional neural network to perform deep representation learning on the semantic features of the test questions on the text channel and the image channel, respectively, according to the content of the learning resources, and performs feature fusion to obtain multi-modal test question fusion semantic features;

the test question commonality knowledge modeling module 2, which calculates the association coefficients among knowledge points from the embedded representations of the knowledge points of the learning resources, so as to obtain the knowledge influence weights;

the learner response prediction module 3, which, based on the learner's learning interaction sequence, constructs the learner's learning state vector and a comprehensive learning vector representation of the question to be answered from the learner's learning behavior features and the features of the learning resource to be answered, so as to dynamically predict the learner's response result;

and the learner knowledge mastery analysis module 4, which uses the parameters of the response prediction module together with the learner's learning state vector to predict the learner's mastery of each knowledge point in the learning resources.

The technical scheme of the invention is further described below in conjunction with the symbol explanations.

The symbols involved in the examples of the present invention are shown in table 1.

Table 1 Symbols used in the embodiment of the present invention


Step S101 provided by the embodiment of the present invention specifically includes:

(1.1) collecting the learning behavior features and learning resource features of a learner from the learner's response sequence:

the data set TIMSS2007 is built from the Trends in International Mathematics and Science Study (TIMSS) organized by the IEA, using the 2007 interaction data between fourth-grade English learners and mathematics test questions; the numbers of interactions, knowledge points, test questions and learners contained in the data set TIMSS2007 are shown in Table 2 below;

Table 2 Information on the data set TIMSS2007

Item	Data set TIMSS2007
Number of interactions between learners and exercises	6334
Number of knowledge points	9
Number of test questions	58
Number of learners	779

The learner's learning behavior characteristics and learning resource characteristics are selected from the dataset TIMSS2007, and the selected extrinsic learning behavior characteristics and resource characteristics are shown in table 3 below.

Table 3 selected features in dataset TIMSS2007

(1.2) performing deep representation learning on the text semantic features of the test questions according to their content features on the text channel:

the test question text samples are preprocessed by word segmentation and stop-word removal; the test questions are then processed with the CBOW algorithm of the word vector model Word2vec, which predicts a target word from its context, yielding word-level embedded representations of the test questions;
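The preprocessing and the construction of CBOW training examples can be illustrated as below. This is only a sketch under stated assumptions: tokenization is done by whitespace splitting (suitable for English text), the stop-word list is a small made-up set, and the context window of 2 is arbitrary. The sketch only builds the (context, target) pairs that a CBOW model learns from; it does not train Word2vec itself. All names are hypothetical.

```python
from typing import List, Tuple

# Toy stop-word list for illustration only; a real list would be much longer.
STOP_WORDS = {"the", "of", "a", "is", "what", "and", "with"}


def preprocess(text: str) -> List[str]:
    """Split on whitespace, strip punctuation, lowercase, and drop stop words."""
    tokens = [tok.strip(".,?!:;").lower() for tok in text.split()]
    return [tok for tok in tokens if tok and tok not in STOP_WORDS]


def cbow_pairs(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """Build (context words, target word) pairs as used by CBOW training."""
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:  # skip targets that have no surrounding words
            pairs.append((context, target))
    return pairs


tokens = preprocess("What is the area of a rectangle with sides 3 and 5?")
for context, target in cbow_pairs(tokens):
    print(context, "->", target)
```

A trained CBOW model maps each remaining token to an embedding vector, and the sequence of these vectors forms the word-level input to the bidirectional gated recurrent unit network.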

a bidirectional gated recurrent unit network is used to obtain the word-level semantic features of the test question text in the forward and backward directions, respectively:

the forward and backward features are fused to obtain the word-level bidirectional semantic features of the test questions:

the word-level bidirectional semantic features are mapped by a max pooling operation to obtain the sentence-level semantic representation of the test questions, i.e., the text semantic features of the test questions:

$x_v = \max(v_1, v_2, \ldots, v_T)$

where $T$ is the number of words in the test question at the sentence level.

(1.3) constructing a convolutional neural network model according to the content features of the test questions on the image channel, and performing deep representation learning on the image semantic features of the test questions:

the test question image samples are resized to 128 × 128;

a convolutional neural network structure is designed for the test question image content; the network built on this structure has 9 layers in total. The first 6 layers are alternating convolutional and pooling layers, and the last 3 layers are linear layers. FIG. 4 shows the convolutional neural network structure according to an embodiment of the present invention. The convolution kernels of the Conv1, Conv2 and Conv3 layers are 3 × 3, and their input feature maps have sizes 128, 64 and 32, respectively; the Pool1, Pool2 and Pool3 layers are max pooling layers using a 2 × 2 filter with a stride of 2;

the test question image is taken as input, and the image semantic feature representation of the test question is obtained through the convolutional neural network.

(1.4) carrying out feature fusion on the obtained test question text semantic features and the image semantic features to obtain test question fusion semantic features simultaneously containing test question bimodal semantic information:

(1.5) combining the test question fusion semantic features with the response features of the learner to obtain the comprehensive response vector representation of the learner:

x_t is the test question fusion semantic feature vector; r_t is the collected response feature of the learner, namely the actual response; 0 is an all-zero vector with the same dimension as x_t. The order in which the two are concatenated reflects whether the learner's response is correct, so that the comprehensive response vector c_t obtained by concatenation allows subsequent operations to distinguish the different effects of different responses on the learner's knowledge state.
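The construction of c_t can be sketched as follows. The text states only that the concatenation order encodes correctness; which order corresponds to a correct response is an assumption here (x_t first for a correct response, the zero vector first for an incorrect one), and the function name is hypothetical.

```python
def comprehensive_response(x_t, r_t):
    """Concatenate x_t with an all-zero vector of the same dimension.

    Assumed ordering: correct (r_t == 1) gives [x_t, 0];
    incorrect (r_t == 0) gives [0, x_t].
    """
    zeros = [0.0] * len(x_t)
    return list(x_t) + zeros if r_t == 1 else zeros + list(x_t)


x_t = [0.2, 0.7]
print(comprehensive_response(x_t, 1))  # [0.2, 0.7, 0.0, 0.0]
print(comprehensive_response(x_t, 0))  # [0.0, 0.0, 0.2, 0.7]
```

Either ordering doubles the dimension of x_t and places the question semantics in a different half of c_t for correct and incorrect responses.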

Step S102 provided by the embodiment of the present invention specifically includes:

(2.1) inputting the comprehensive response vector of the learner into a gated recurrent unit network to calculate the hidden state at time t, thereby constructing a gating-based learning state tracking model that tracks the learner's learning state matrix as it changes over time:

$$Z_t = \mathrm{sigmoid}(W_Z \cdot [H_{t-1}, c_t] + b_Z)$$

$$R_t = \mathrm{sigmoid}(W_R \cdot [H_{t-1}, c_t] + b_R)$$

Z_t and R_t are the gate values of the update gate and the reset gate respectively, each obtained by concatenating the hidden state H_{t-1} at time t-1 with the comprehensive response vector c_t at time t and applying a linear transformation; W_Z, W_R and W are weight matrices, b_Z, b_R and b are the corresponding bias terms, and sigmoid and tanh are the activation functions.
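For orientation, one gated update can be sketched in plain Python. Only the two gate equations are given in the text; the candidate state and the final interpolation below follow the standard gated recurrent unit formulation and are assumptions, as are the vector-valued hidden state, the toy weights and all helper names.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, x, b):
    """Compute W.x + b for a list-of-rows matrix W."""
    return [sum(w * v for w, v in zip(row, x)) + b_i for row, b_i in zip(W, b)]


def gru_step(h_prev, c_t, params):
    """One update of the hidden state from H_{t-1} and c_t."""
    W_Z, b_Z, W_R, b_R, W, b = params
    concat = h_prev + c_t                                  # [H_{t-1}, c_t]
    z = [sigmoid(a) for a in affine(W_Z, concat, b_Z)]     # update gate Z_t
    r = [sigmoid(a) for a in affine(W_R, concat, b_R)]     # reset gate R_t
    # Assumed standard form: the reset gate scales H_{t-1} in the candidate.
    reset_concat = [ri * hi for ri, hi in zip(r, h_prev)] + c_t
    h_cand = [math.tanh(a) for a in affine(W, reset_concat, b)]
    # Assumed convention: Z_t weights the candidate state.
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, h_cand)]


hidden, inp = 2, 4


def const_matrix(value):
    return [[value] * (hidden + inp) for _ in range(hidden)]


params = (const_matrix(0.1), [0.0] * hidden,
          const_matrix(-0.1), [0.0] * hidden,
          const_matrix(0.2), [0.0] * hidden)
h_0 = [0.0, 0.0]
c_1 = [0.2, 0.7, 0.0, 0.0]
h_1 = gru_step(h_0, c_1, params)
print(h_1)
```

In practice a framework implementation of the gated recurrent unit would replace this hand-written step; the sketch only makes the data flow of the two gates explicit.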

(2.2) embedding each knowledge point of the test questions acquired in step S101 through a linear layer to represent it as a knowledge embedding vector;

(2.3) constructing a static knowledge embedding matrix that stores the embedding information of each knowledge point, and calculating the correlation coefficients between the knowledge point of the test question answered by the learner at the current moment and all knowledge points to obtain the knowledge influence weights:

i ∈ (1, K), where K is the number of all knowledge points; k_t is the knowledge embedding vector at time t; and M_i is the knowledge embedding vector corresponding to the i-th knowledge point in the knowledge embedding matrix M;
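The weight formula itself is not reproduced in this text. A common way to turn the correlation between k_t and each M_i into weights is a softmax over inner products; the sketch below uses that form as an assumption, and the function name and example matrix are hypothetical.

```python
import math


def knowledge_weights(k_t, M):
    """Softmax over the inner products k_t . M_i (assumed correlation measure)."""
    scores = [sum(a * b for a, b in zip(k_t, m_i)) for m_i in M]
    peak = max(scores)                      # subtract the max for stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# K = 3 knowledge points with 2-dimensional embeddings; values are made up.
M = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
k_t = [1.0, 0.0]
beta = knowledge_weights(k_t, M)
print(beta)
```

The weights are non-negative and sum to 1, so knowledge points whose embeddings align with k_t receive a larger share of the influence.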

(2.4) based on the Markov property, combining the knowledge influence weights of the knowledge points corresponding to the test question at the next moment with the learning state matrix of the learner at the current moment to predict the commonality knowledge increment to be acquired by the learner at the next moment:

the learning state matrix H_t consists of K learning state vectors, one for each concept i, each describing the learner's learning state on that concept at time t.
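The combination formula is likewise missing from this text. One plausible reading, used here purely as an assumption, is a weighted sum of the K learning state vectors in H_t, with the knowledge influence weights of the next test question as coefficients. All names and values below are hypothetical.

```python
def knowledge_increment(beta_next, H_t):
    """Weighted sum of the K rows of H_t (assumed form of the increment)."""
    dim = len(H_t[0])
    return [sum(b * row[j] for b, row in zip(beta_next, H_t)) for j in range(dim)]


# K = 3 concepts, 2-dimensional learning state vectors; values are made up.
H_t = [[0.2, 0.4], [0.6, 0.0], [1.0, 1.0]]
beta_next = [0.5, 0.5, 0.0]
s_next = knowledge_increment(beta_next, H_t)
print(s_next)  # approximately [0.4, 0.2]
```

Because the weights depend only on the next test question and the states only on the current moment, the prediction uses no history beyond H_t, which matches the Markov property invoked in step (2.4).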

Step S103 provided by the embodiment of the present invention specifically includes:

(3.1) concatenating the commonality knowledge increment of the learner at the next moment with the test question fusion semantic features to obtain the fused representation of the question learning vector:

(3.2) applying linear transformations and activation functions to the fused representation to predict the learner's response result at the next moment:

W_1 and W_2 are weight matrices, and b_1 and b_2 are the corresponding bias terms; relu and sigmoid are the activation functions;
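Steps (3.1) and (3.2) can be sketched together. The layer form sigmoid(W_2 · relu(W_1 · y + b_1) + b_2) is taken from the mastery formula in step (4.2), which states that it shares these parameters; the concatenation order (increment first, question semantics second), the scalar output, and the toy weights are assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_response(s_next, x_next, W1, b1, W2, b2):
    """Probability of a correct response from the fused representation."""
    y = s_next + x_next                        # (3.1) concatenation (order assumed)
    hidden = [max(0.0, sum(w * v for w, v in zip(row, y)) + b)
              for row, b in zip(W1, b1)]       # relu(W_1 . y + b_1)
    logit = sum(w * h for w, h in zip(W2, hidden)) + b2
    return sigmoid(logit)                      # sigmoid(W_2 . hidden + b_2)


s_next = [0.4, 0.2]          # commonality knowledge increment (made up)
x_next = [0.2, 0.7]          # test question fusion semantic features (made up)
W1 = [[1.0, 1.0, 1.0, 1.0], [-1.0, -1.0, -1.0, -1.0]]
b1 = [0.0, 0.0]
W2 = [1.0, 1.0]
b2 = 0.0
p = predict_response(s_next, x_next, W1, b1, W2, b2)
print(round(p, 4))
```

The output lies strictly between 0 and 1 and can be read as the predicted probability that the learner answers the next test question correctly.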

(3.3) defining a loss function L according to the predicted response result and the actual response of the learner:

r_t is the actual response result of the learner. The objective function is a negative log-likelihood function;

(3.4) learning and updating the model network parameters by using an Adam optimizer.
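The loss in step (3.3) can be illustrated with a negative log-likelihood over a response sequence. Whether the formula sums or averages over time steps is not recoverable from this text, so the sum used below is an assumption; the Adam update of step (3.4) is not shown, and the function name is hypothetical.

```python
import math


def negative_log_likelihood(predictions, responses, eps=1e-12):
    """Sum over t of -[r_t log(p_t) + (1 - r_t) log(1 - p_t)]."""
    total = 0.0
    for p, r in zip(predictions, responses):
        p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
        total -= r * math.log(p) + (1 - r) * math.log(1.0 - p)
    return total


predicted = [0.9, 0.2, 0.6]   # predicted probabilities of a correct response
actual = [1, 0, 1]            # actual responses r_t
loss = negative_log_likelihood(predicted, actual)
print(round(loss, 4))
```

Confident correct predictions contribute little to the loss, while confident wrong predictions are penalized heavily, which is the signal the optimizer uses to update the network parameters.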

Step S104 provided by the embodiment of the present invention specifically includes:

(4.1) determining the learner to be assessed, and concatenating the learner's learning state vectors for different concepts at different moments with zero vectors of the same dimension to obtain knowledge change vectors from which the question semantic information has been removed;

(4.2) taking the knowledge change vector as input, and outputting the predicted degree of knowledge mastery through a linear layer and an activation layer that share the weight matrices and bias terms of step (3.2):

$$g_t = \mathrm{sigmoid}(W_2 \cdot \mathrm{relu}(W_1 \cdot v_t' + b_1) + b_2)$$
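The formula for g_t can be evaluated directly once v_t′ is built. The sketch below assumes that the zero vector occupies the positions that hold the question semantics during response prediction (state first, zeros second) and that the state vector and the question feature vector have the same dimension; the weights are toy values and all names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mastery(h_i, W1, b1, W2, b2):
    """g_t = sigmoid(W_2 . relu(W_1 . v_t' + b_1) + b_2) for one concept."""
    v_prime = h_i + [0.0] * len(h_i)           # zero out the question semantics
    hidden = [max(0.0, sum(w * v for w, v in zip(row, v_prime)) + b)
              for row, b in zip(W1, b1)]
    return sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)


# Parameters would come from training in step (3.2); these are made up.
W1 = [[1.0, 1.0, 1.0, 1.0], [-1.0, -1.0, -1.0, -1.0]]
b1 = [0.0, 0.0]
W2 = [1.0, 1.0]
b2 = 0.0
h_i = [0.4, 0.2]             # learning state vector for one concept
g = mastery(h_i, W1, b1, W2, b2)
print(round(g, 4))
```

Evaluating this for every concept and every moment yields a mastery trajectory per knowledge point without retraining any parameters.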

To demonstrate the inventiveness and technical value of the technical solution of the present invention, this section gives an application example of the claimed technical solution on specific products or related technologies.

In the experiments of the invention, the main development environment comprises Windows 10, GTX 1080Ti, PyTorch 1.6.0 and Python 3.7; the model's hyperparameter settings are shown in Table 4 below.

Table 4 Experimental model hyperparameter settings

Hyperparameter	Value
batch_size	1
epoch	30
dropout	0.8
learning_rate	0.001
knowledge_embedding_size	50
text_embedding_size	200
img_embedding_size	200
kernel_size	3

It should be noted that the embodiments of the present invention can be realized in hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portions may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or special purpose design hardware. Those of ordinary skill in the art will appreciate that the apparatus and methods described above may be implemented using computer executable instructions and/or embodied in processor control code, such as provided on a carrier medium such as a magnetic disk, CD or DVD-ROM, a programmable memory such as read only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The device of the present invention and its modules may be implemented by hardware circuitry, such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips, transistors, etc., or programmable hardware devices such as field programmable gate arrays, programmable logic devices, etc., as well as software executed by various types of processors, or by a combination of the above hardware circuitry and software, such as firmware.

The advantages of the embodiment of the invention in research, development and use are described below in combination with data and charts from the test process.

The multi-modal knowledge level assessment and learning performance prediction method is compared with a deep-learning-based learning performance prediction method in terms of two indices: the area under the curve (AUC) and the binary cross-entropy loss (BCELoss). AUC is an evaluation index for measuring the quality of binary classification models and can still accurately describe overall model performance when positive and negative samples are imbalanced. An AUC of 0.5 corresponds to random guessing, and the closer the value is to 1, the more accurate the prediction result. BCELoss measures, to a certain extent, the deviation between the predicted value and the true value, and the smaller the value, the more accurate the prediction result.
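To make the two indices concrete, the following sketch computes both on a toy set of predictions. The AUC is computed through its pairwise-ranking definition (the fraction of positive and negative pairs ranked correctly, with ties counted as half), and BCELoss is averaged over samples. The data and function names are hypothetical and unrelated to the reported experiments.

```python
import math


def auc(scores, labels):
    """Fraction of (positive, negative) pairs where the positive scores higher."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bce_loss(scores, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and labels."""
    total = 0.0
    for p, y in zip(scores, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(scores)


scores = [0.9, 0.8, 0.3, 0.2]
labels = [1, 0, 1, 0]
print(auc(scores, labels))        # 0.75
print(round(bce_loss(scores, labels), 4))
```

A model that ranks every correct response above every incorrect one reaches an AUC of 1.0, while its BCELoss additionally depends on how well calibrated the predicted probabilities are.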

The invention is compared with a deep-learning-based learning performance prediction method. To ensure a fair comparison, the hyperparameters of the modules shared by the two models are set to the same values. The AUC and BCELoss results of the two methods on the TIMSS2007 data set are shown in Table 5, and the training process of the model on the TIMSS2007 data set is shown in Fig. 5 and Fig. 6.

Table 5 Comparison of experimental results of different methods

The experimental results show that, on the TIMSS2007 data set, the multi-modal knowledge level assessment and learning performance prediction method provided by the invention improves the AUC by 15.2% and reduces the BCELoss by 0.028. The invention takes the semantic information of the test questions into account, designing a recurrent neural network and a convolutional neural network to perform deep representation learning on the text and image features respectively and fusing them to better characterize the test questions. Furthermore, a bidirectional recurrent neural network is introduced and a gating-based learning state tracking model is constructed to dynamically track the learning state of the learner, making full use of the interaction information between learners and test questions. Finally, the commonality knowledge increment of the learner is obtained using the Markov property, and the learner's response and learning knowledge state are predicted in combination with the semantic features of the test questions, which refines the model structure and improves the accuracy of response prediction. In terms of both AUC and BCELoss, the proposed method is therefore more effective than the deep-learning-based learning performance prediction method it is compared with.

In conclusion, the multi-modal knowledge level assessment and learning performance prediction method and system provided by the invention not only achieve more accurate prediction of the learner's responses, but also predict the learner's learning knowledge state, thereby helping learners carry out more targeted learning efficiently. In the method, learning behavior features and learning resource features are first collected from the learner response sequence; a recurrent neural network and a convolutional neural network are introduced to perform deep representation learning on the text semantic features and image semantic features of the test questions respectively; the test question fusion semantic features are obtained through feature fusion; and the comprehensive response vector is obtained by combining them with the learning behavior features. A recurrent neural network is then introduced to construct a gating-based learning state tracking model, and the commonality knowledge increment to be acquired by the learner is predicted from the learner's learning state matrix and the knowledge influence weights corresponding to the test question to be answered. Next, the commonality knowledge increment of the learner is fused with the test question fusion semantic features to predict the learner's response at the next moment, and a corresponding loss function is constructed for training. Finally, the trained parameters are used to predict the learning knowledge state of the learner, capturing and analyzing how the learner's mastery of each knowledge point changes over time.

The foregoing is merely illustrative of specific embodiments of the present invention, and the scope of protection of the invention is not limited thereto. Any modification, equivalent substitution or improvement made by those skilled in the art within the technical scope disclosed herein and within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A multi-modal knowledge level assessment and learning performance prediction method, characterized in that the multi-modal knowledge level assessment and learning performance prediction method comprises:

2. The multi-modal knowledge level assessment and learning performance prediction method according to claim 1, wherein the learning behavior features and learning resource features are acquired from a learner response sequence, the learning behavior features are response features including time sequence information, the learning resource features include content features and knowledge point features of the learning resource, and the content features of the learning resource are divided into text content features and image content features of the test questions;

$$x_v = \max(v_1, v_2, \ldots, v_T)$$

wherein T is the number of words in the test question at the sentence level;

3. The multi-modal knowledge level assessment and learning performance prediction method of claim 1, wherein the test question fusion semantic feature expression is:

4. The multi-modal knowledge level assessment and learning performance prediction method according to claim 1, wherein the specific process of constructing a gating-based learning state tracking model and predicting the commonality knowledge increment to be acquired by the learner comprises:

$$Z_t = \mathrm{sigmoid}(W_Z \cdot [H_{t-1}, c_t] + b_Z)$$

$$R_t = \mathrm{sigmoid}(W_R \cdot [H_{t-1}, c_t] + b_R)$$

5. The multi-modal knowledge level assessment and learning performance prediction method according to claim 1, wherein the commonality knowledge increment and the test question fusion semantic features are fused to obtain the fused representation of the question learning vector as follows:

6. The method for multimodal knowledge level assessment and learning performance prediction as claimed in claim 1, wherein the applying trained parameters to predict learning knowledge state comprises:

concatenating the learning knowledge state vectors of different concepts at different moments with zero vectors of the same dimension to obtain knowledge change vectors from which the question semantic information has been removed:

$$g_t = \mathrm{sigmoid}(W_2 \cdot \mathrm{relu}(W_1 \cdot v_t' + b_1) + b_2)$$

7. A multi-modal knowledge level assessment and learning performance prediction system applying the multi-modal knowledge level assessment and learning performance prediction method of any one of claims 1-6, wherein the multi-modal knowledge level assessment and learning performance prediction system comprises:

8. A computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the multimodal knowledge level assessment and learning performance prediction method of any of claims 1-6.

9. A computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the multimodal knowledge level assessment and learning performance prediction method of any of claims 1-6.

10. An information data processing terminal for implementing the multimodal knowledge level assessment and learning performance prediction system of claim 7.