CN114266340A - Knowledge query network model introducing self-attention mechanism


Info

Publication number
CN114266340A
CN114266340A
Authority
CN
China
Prior art keywords
knowledge
model
self-attention mechanism
student
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111560167.XA
Other languages
Chinese (zh)
Inventor
程艳
吴刚
陈豪迈
项国雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangxi Normal University
Original Assignee
Jiangxi Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangxi Normal University
Priority to CN202111560167.XA
Publication of CN114266340A
Legal status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

To address the shortcomings of the knowledge query network and the self-attentive knowledge tracing model, a knowledge query network model introducing a self-attention mechanism is designed. The invention aims to fuse the respective advantages of the knowledge query network and the self-attention mechanism: introducing self-attention into the knowledge query network preserves the model's ability to model sequences while relating different positions of a single sequence to compute the sequence's representation, yielding more accurate internal key features of the student's historical answer records. In addition, a regularization term corresponding to the reconstruction error is added to the model's loss function to enhance the consistency of the model's predictions and thereby resolve the existing reconstruction error.

Description

Knowledge query network model introducing self-attention mechanism
Technical Field
The invention belongs to the field of intelligent education and is applied to the knowledge tracing task.
Background
First, definitions of terms: 1. Knowledge Tracing (KT): modeling a learner's answer records to obtain the learner's knowledge state and to predict the probability that the learner answers the next question correctly.
2. Knowledge Query Network (KQN): in 2019, researchers proposed the knowledge query network for the knowledge tracing task. It uses neural networks to encode the student's historical interaction sequence up to the current time step, and the KC contained in the question at the next time step, into a knowledge state vector and a skill vector of the same dimension, and then defines the interaction between the student's knowledge state and the KC as the dot product of the two vectors.
3. Long Short-Term Memory network (LSTM): an improved model proposed to address the vanishing- and exploding-gradient problems of RNNs. It adds "gates" to the original RNN to control the flow of information, which mitigates vanishing and exploding gradients to some extent and captures long-range dependencies in a sequence.
4. Self-attention mechanism: originally inspired by studies of human vision. In cognitive science, because of information-processing bottlenecks, humans selectively attend to part of the available information while ignoring the rest. This idea was later applied to image processing and natural language processing with good results, and self-attention has recently been introduced into knowledge tracing to focus on the parts of the learning history that matter most for prediction (a minimal sketch follows these definitions).
5. Reconstruction error: one of the main problems of the knowledge query network model. When a student correctly answers a question involving a certain skill, the model's predicted probability that the student can answer a question involving that skill at the current time step may nevertheless decrease, and vice versa.
6. Knowledge Component (KC): a KC can be broadly understood as a knowledge point, concept, principle, fact, or skill.
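For illustration only (this sketch is not part of the original disclosure), scaled dot-product self-attention over a student's embedded interaction sequence can be written as follows in PyTorch; the function and tensor names are assumptions:

    import math
    import torch

    def self_attention(x, causal_mask=None):
        # x: (batch, seq_len, d) embedded interaction sequence; queries, keys
        # and values are all taken from x itself (no learned projections here)
        scores = x @ x.transpose(-2, -1) / math.sqrt(x.size(-1))
        if causal_mask is not None:
            # e.g. a lower-triangular mask so step t attends only to steps t' <= t
            scores = scores.masked_fill(causal_mask == 0, float('-inf'))
        weights = torch.softmax(scores, dim=-1)  # attention weights over the sequence
        return weights @ x                       # position-wise re-weighted representation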
Secondly, the prior art: 1. (1) Bayesian Knowledge Tracing (BKT): the BKT model represents the student's knowledge state as a set of latent binary variables and updates them with a Hidden Markov Model (HMM) from observable variables such as whether the student answers questions correctly. Although BKT and its extensions have been very successful in the KT field, they have notable limitations: first, representing a learner's knowledge state as a set of binary variables does not match real-world learning processes; second, because BKT models each KC separately, it can capture neither the relationships between different KCs nor undefined KCs. (2) Deep Knowledge Tracing (DKT): in 2015, the DKT model introduced deep neural networks into the knowledge tracing task for the first time, using an LSTM to model student sequences; it achieved good results, but its interpretability has always been questioned. (3) Knowledge Query Network (KQN): KQN encodes the student's historical interaction sequence up to the current time step and the KC contained in the next time step's question into a knowledge state vector and a skill vector of the same dimension using neural networks, and defines the interaction between the student's knowledge state and the KC as the dot product of the two vectors; like DKT, it suffers from reconstruction error.
2. Self-Attentive Knowledge Tracing (SAKT): SAKT replaces the RNN of the original DKT model with a Transformer structure in the KT field, alleviating the RNN's long-term dependency problem and greatly improving prediction performance; however, SAKT loses the RNN's sequence modeling ability.
Thirdly, the technical problems: 1. Although the knowledge query network somewhat improves the interpretability of the student-KC interaction, its prediction performance is inferior to self-attentive knowledge tracing, because the LSTM's long-term dependency problem limits it; like DKT, it also suffers from reconstruction error. 2. Self-attentive knowledge tracing uses the more advanced Transformer structure but loses the RNN's sequence modeling ability; since a student's learning process is continuous, the model's sequence modeling ability cannot be neglected.
Disclosure of Invention
1. To address the shortcomings of the knowledge query network and the self-attentive knowledge tracing model, the invention aims to fuse their respective advantages: a self-attention mechanism is introduced into the knowledge query network model to obtain more accurate internal key features of the student's historical interaction sequence while preserving the recurrent modeling capacity of the long short-term memory network, and a regularization term is introduced into the loss function to enhance the consistency of the model's predictions and resolve the reconstruction error of the knowledge query network;
2. The technical innovations of the invention are as follows: (1) a deep knowledge tracing model is proposed that introduces a self-attention mechanism into the knowledge query network: the positional information provided by the long short-term memory network models the ordering of the student's historical interaction sequence, preserving the model's sequence modeling ability, while the self-attention mechanism relates different positions of a single sequence to compute the sequence's representation and obtain more accurate internal key features of the student's historical answer records; fusing the advantages of both improves prediction performance; (2) a regularization term corresponding to the reconstruction problem is introduced into the model's loss function to enhance prediction consistency and thereby resolve the reconstruction error present in KQN (a sketch of such a regularized loss follows);
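The patent does not give an explicit formula for the regularization term. The sketch below is therefore an assumption, modeled on the DKT+-style reconstruction regularizer, which penalizes disagreement between the model's prediction for the question the student just answered and the observed answer; all names (regularized_loss, lambda_r, pred_current, answer_current) are hypothetical:

    import torch.nn.functional as F

    def regularized_loss(pred_next, target_next, pred_current, answer_current, lambda_r=0.1):
        # Primary knowledge-tracing objective: cross-entropy on the prediction
        # for the NEXT question.
        prediction_loss = F.binary_cross_entropy(pred_next, target_next)
        # Hypothetical reconstruction regularizer (assumption): the prediction
        # for the question the student JUST answered should agree with the
        # observed answer, discouraging the counter-intuitive probability drop
        # described above as "reconstruction error".
        reconstruction_loss = F.binary_cross_entropy(pred_current, answer_current)
        return prediction_loss + lambda_r * reconstruction_loss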
drawings
FIG. 1 is a diagram of a knowledge query network architecture incorporating a self-attention mechanism.
Detailed Description
The drawing of the specification is a structure diagram of the model of the invention at time t. The model consists of three parts: a knowledge state encoder, a skill encoder, and a knowledge state query component. At time t, the knowledge state encoder first feeds the student's historical interaction tuple x_t into the LSTM layer to obtain the hidden state h_t, then feeds h_t into the self-attention layer to obtain a_t, and finally transforms a_t into the d-dimensional knowledge state vector KS_t. The skill encoder embeds the skill q_(t+1) contained in the question at the next time step t+1 into a skill vector S_(t+1), also of dimension d, by means of a multilayer perceptron (MLP). Both vectors are then passed to the knowledge state query component, which describes the interaction between the student's knowledge state and the KC contained in the question as the dot product of the two vectors; finally, the dot-product result is passed through a sigmoid function to obtain the predicted probability that the student at the current time step can correctly answer the question at the next time step. A minimal code sketch of this forward pass follows.
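The sketch below approximates the forward pass described above in PyTorch; it is for illustration only and is not the original implementation. The class name, layer sizes, and the use of nn.MultiheadAttention are assumptions, and causal masking is omitted for brevity:

    import torch
    import torch.nn as nn

    class KQNSelfAttention(nn.Module):
        # Illustrative sketch of the model at time t: LSTM -> self-attention
        # -> knowledge state vector; MLP skill encoder; dot product + sigmoid.
        def __init__(self, n_skills, hidden=128, d=64):
            super().__init__()
            # Knowledge state encoder: LSTM for sequential (positional) modeling,
            # followed by a self-attention layer over the hidden states.
            self.lstm = nn.LSTM(2 * n_skills, hidden, batch_first=True)
            self.attn = nn.MultiheadAttention(hidden, num_heads=1, batch_first=True)
            self.to_ks = nn.Linear(hidden, d)      # -> knowledge state vector KS_t
            # Skill encoder: MLP embedding the next question's skill q_(t+1).
            self.skill_mlp = nn.Sequential(
                nn.Linear(n_skills, hidden), nn.ReLU(), nn.Linear(hidden, d))

        def forward(self, x, q_next):
            # x: (batch, t, 2*n_skills) one-hot interaction tuples x_1..x_t
            # q_next: (batch, n_skills) one-hot skill of the question at t+1
            h, _ = self.lstm(x)            # hidden states h_1..h_t
            a, _ = self.attn(h, h, h)      # relate different positions of the sequence
            ks = self.to_ks(a[:, -1, :])   # d-dimensional knowledge state vector KS_t
            s = self.skill_mlp(q_next)     # d-dimensional skill vector S_(t+1)
            # Knowledge state query: dot product + sigmoid -> P(correct at t+1)
            return torch.sigmoid((ks * s).sum(dim=-1))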

Claims (5)

1. A deep knowledge tracing model of a knowledge query network introducing a self-attention mechanism, characterized in that: the positional information provided by the long short-term memory network in the knowledge state encoder is first used to model the ordering of the student's interaction sequence; different positions of a single sequence are then related through the self-attention mechanism to compute the representation of the sequence and obtain more accurate internal key features of the student's historical answer records, and the result from the attention layer is encoded into a knowledge state vector; finally, the model takes the dot product of the skill vector produced by the multilayer perceptron of the skill encoder and the knowledge state vector produced by the knowledge state encoder to simulate the interaction between the knowledge state and the knowledge point, and feeds the result into a sigmoid function to obtain the probability that the student answers the next question correctly; a regularization term corresponding to the reconstruction problem is introduced into the loss function to enhance the consistency of the model's predictions and thereby resolve the reconstruction error.
2. The knowledge query network model introducing a self-attention mechanism according to claim 1, characterized in that: the knowledge state encoder preserves the sequence modeling ability of the long short-term memory network while using the self-attention mechanism to automatically focus on the answer records in the student's historical interaction sequence that most influence the prediction result, extracting more accurate features of the student's knowledge state.
3. The knowledge query network model introducing a self-attention mechanism according to claim 1, characterized in that: the regularization term added for the reconstruction problem regularizes the original model by taking into account the loss between the prediction and the interaction of the current student knowledge state and skill.
4. The knowledge query network model introducing a self-attention mechanism according to claim 1, characterized in that: the student interaction sequence input to the knowledge state encoder and the skills input to the skill encoder are encoded as one-hot vectors (illustrated by the sketch following the claims).
5. The knowledge query network model introducing a self-attention mechanism according to claim 1, characterized in that: the dot product of the student knowledge state vector output by the knowledge state encoder and the skill vector output by the skill encoder reflects how, in the real world, students answer questions on the basis of their own knowledge state and the question.
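As an illustration of the one-hot encoding referenced in claim 4 (a sketch under assumed conventions, not part of the original disclosure; the function name and index layout are hypothetical):

    import torch

    def encode_interaction(skill_id, correct, n_skills):
        # One-hot interaction tuple of length 2*n_skills: index skill_id marks
        # an incorrect answer to that skill, index skill_id + n_skills a
        # correct one.
        x = torch.zeros(2 * n_skills)
        x[skill_id + int(correct) * n_skills] = 1.0
        return x

    # Example: skill 3 answered correctly, with 10 skills in total
    x_t = encode_interaction(3, correct=True, n_skills=10)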
CN202111560167.XA, filed 2021-12-20 (priority 2021-12-20): Knowledge query network model introducing self-attention mechanism. Status: Pending. Publication: CN114266340A (en)

Priority Applications (1)

Application Number: CN202111560167.XA
Priority Date: 2021-12-20
Filing Date: 2021-12-20
Title: Knowledge query network model introducing self-attention mechanism


Publications (1)

Publication Number: CN114266340A
Publication Date: 2022-04-01

Family

ID=80827932

Family Applications (1)

Application Number: CN202111560167.XA
Title: Knowledge query network model introducing self-attention mechanism
Priority Date: 2021-12-20
Filing Date: 2021-12-20
Status: Pending

Country Status (1)

Country Link
CN (1) CN114266340A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
* CN116341990A (priority date 2023-05-29; published 2023-06-27), 中交第四航务工程勘察设计院有限公司: Knowledge management evaluation method and system for infrastructure engineering
* CN116341990B (granted 2023-08-04), same assignee and title as above

Similar Documents

Publication Publication Date Title
CN110377710B (en) Visual question-answer fusion enhancement method based on multi-mode fusion
CN110428010B (en) Knowledge tracking method
CN111897941B (en) Dialogue generation method, network training method, device, storage medium and equipment
CN109766427B (en) Intelligent question-answering method based on collaborative attention for virtual learning environment
CN112541063B (en) Man-machine conversation method and system based on self-learning conversation model
CN110851760B (en) Human-computer interaction system for integrating visual question answering in web3D environment
CN109597876A (en) A kind of more wheels dialogue answer preference pattern and its method based on intensified learning
CN110175228A (en) Based on basic module and the loop embedding of machine learning dialogue training method and system
CN114398976A (en) Machine reading understanding method based on BERT and gate control type attention enhancement network
CN113610235A (en) Adaptive learning support device and method based on deep knowledge tracking
CN111563146A (en) Inference-based difficulty controllable problem generation method
CN115545160B (en) Knowledge tracking method and system for multi-learning behavior collaboration
CN117218498B (en) Multi-modal large language model training method and system based on multi-modal encoder
CN112800323A (en) Intelligent teaching system based on deep learning
CN112990464A (en) Knowledge tracking method and system
CN113239209A (en) Knowledge graph personalized learning path recommendation method based on RankNet-transformer
CN113360618A (en) Intelligent robot dialogue method and system based on offline reinforcement learning
CN116136870A (en) Intelligent social conversation method and conversation system based on enhanced entity representation
CN114266340A (en) Knowledge query network model introducing self-attention mechanism
Wu et al. Muscle Vectors as Temporally Dense "Labels"
Lin et al. Fuzzy Sets Theory Preliminary
CN114117033B (en) Knowledge tracking method and system
Ma et al. Dtkt: An improved deep temporal convolutional network for knowledge tracing
CN116611442A (en) Interest point recommendation method based on deep semantic extraction
CN112256858B (en) Double-convolution knowledge tracking method and system fusing question mode and answer result

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination