CN114266340A - Knowledge query network model introducing self-attention mechanism - Google Patents
Knowledge query network model introducing self-attention mechanism
- Publication number
- CN114266340A
- Authority
- CN
- China
- Prior art keywords
- knowledge
- model
- self
- student
- attention mechanism
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
To address the shortcomings of the knowledge query network and the self-attentive knowledge tracking model, a knowledge query network model introducing a self-attention mechanism is designed. The invention aims to combine the respective advantages of the knowledge query network and the self-attention mechanism: introducing self-attention into the knowledge query network preserves the model's ability to model sequences while strengthening the associations between different positions of a single sequence when computing its representation, so that more accurate internal key features of a student's historical answer records can be obtained. In addition, a regularization term corresponding to the reconstruction error is added to the model's loss function to enhance the consistency of model predictions and thereby address the existing reconstruction error.
Description
Technical Field
The invention belongs to the field of intelligent education and is applied to a knowledge tracking task.
Background
First, terminology: 1. Knowledge Tracking (KT): modeling a learner's answer records to obtain the learner's knowledge state and to predict the probability that the learner answers the next question correctly.
2. Knowledge Query Network (KQN): researchers proposed the knowledge query network in 2019 to solve the knowledge tracking task. It uses neural networks to encode the student's historical interaction sequence up to the current time step, and the KCs contained in the question at the next time step, into a knowledge state vector and a skill vector of the same dimension, and then defines the interaction between the student's knowledge state and the KC as the dot product of the two vectors.
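The dot-product query step described above can be sketched as follows (a minimal illustration; the function name and the toy vectors are made up for the example and are not taken from the patent):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def knowledge_query(knowledge_state, skill_vector):
    """Predict P(correct answer) as the sigmoid of the dot product between
    the d-dimensional knowledge state vector and skill vector."""
    return sigmoid(np.dot(knowledge_state, skill_vector))

# Toy vectors: when knowledge state and skill are well aligned,
# the predicted probability exceeds 0.5.
ks = np.array([0.5, 0.8, -0.1])
sk = np.array([0.6, 0.7, 0.0])
p = knowledge_query(ks, sk)
```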
3. Long short-term memory network (LSTM): the LSTM model is an improvement proposed by researchers to address the vanishing- and exploding-gradient problems of RNNs. "Gates" are added to the original RNN model to control the flow of information, which avoids gradient vanishing and explosion to a certain extent and allows long-range dependencies in a sequence to be captured.
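The gating just described can be illustrated with a single LSTM step (a sketch only; the stacked parameter layout and shapes are assumptions for the example, not the patent's implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step: the forget (f), input (i), and output (o) gates
    control how much information flows through the cell state c, whose
    additive update path is what mitigates vanishing/exploding gradients.
    W, U, b stack the parameters of all four transforms."""
    z = W @ x + U @ h_prev + b          # [4d] pre-activations
    d = h_prev.shape[0]
    f = sigmoid(z[0*d:1*d])             # forget gate
    i = sigmoid(z[1*d:2*d])             # input gate
    o = sigmoid(z[2*d:3*d])             # output gate
    g = np.tanh(z[3*d:4*d])             # candidate cell update
    c = f * c_prev + i * g              # gated cell state (additive path)
    h = o * np.tanh(c)                  # new hidden state
    return h, c

rng = np.random.default_rng(1)
dx, dh = 3, 4
h, c = lstm_step(rng.normal(size=dx), np.zeros(dh), np.zeros(dh),
                 rng.normal(size=(4 * dh, dx)), rng.normal(size=(4 * dh, dh)),
                 np.zeros(4 * dh))
```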
4. Self-attention mechanism: derived from studies of human vision. In cognitive science, because of information-processing bottlenecks, humans selectively attend to part of the available information while ignoring the rest. This idea was later applied to image processing and natural language processing with good results, and recently the self-attention mechanism has been introduced into the knowledge tracking task, where it aims to better focus on the parts of the learning history sequence that matter most for prediction.
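The mechanism can be sketched as scaled dot-product self-attention over a sequence of hidden states (a minimal illustration; the projection matrices and shapes are assumptions for the example):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(H, Wq, Wk, Wv):
    """Scaled dot-product self-attention over hidden states H of shape
    [T, d]: every position is re-expressed as a relevance-weighted
    combination of all positions in the same sequence."""
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # [T, T] pairwise position scores
    A = softmax(scores, axis=-1)        # attention weights, rows sum to 1
    return A @ V

rng = np.random.default_rng(0)
T, d = 5, 4
H = rng.normal(size=(T, d))
W = [rng.normal(size=(d, d)) for _ in range(3)]
out = self_attention(H, *W)
```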
5. Reconstruction error: one of the major problems of the knowledge query network model. When a student correctly answers a question involving a certain skill, the model's predicted probability that the student can answer a question involving that skill at the current time step decreases, and vice versa.
6. Knowledge Component (KC): a KC can be broadly understood as a knowledge point, a knowledge concept, a principle, a fact, or a skill.
Secondly, the prior art: 1. (1) Bayesian Knowledge Tracking (BKT): the BKT model represents the student's knowledge state as a set of latent binary variables and updates them with a Hidden Markov Model (HMM) based on observable variables such as whether the student answers questions correctly. Although BKT and its extensions have been very successful in the KT domain, they still have significant problems. First, representing a learner's knowledge state as a set of binary variables does not match real-world learning processes; second, because BKT models each KC separately, it can capture neither the relationships between different KCs nor undefined KCs. (2) Deep Knowledge Tracking (DKT): in 2015 the DKT model introduced deep neural networks into the knowledge tracking task for the first time, using an LSTM to model student sequences and achieving good results, but its interpretability has always been questioned. (3) Knowledge Query Network (KQN): it encodes the student's historical interaction sequence up to the current time step, and the KCs contained in the next time step's question, into a knowledge state vector and a skill vector of the same dimension using neural networks, then defines the interaction between the student's knowledge state and the KC as the dot product of the two vectors; like DKT, it suffers from the reconstruction error.
2. Self-attentive knowledge tracking (SAKT): it uses a Transformer structure in the KT field to replace the RNN used by the original DKT model, which alleviates the RNN's long-term dependence problem and greatly improves prediction performance; however, SAKT loses the RNN's ability to model sequences.
Thirdly, the technical problems are as follows: 1. Although the knowledge query network improves the interpretability of the interaction between students and KCs to some extent, its prediction performance is inferior to self-attentive knowledge tracking, because the long-term dependence problem of the LSTM limits its performance; moreover, like DKT, it suffers from reconstruction errors. 2. Self-attentive knowledge tracking uses the more advanced Transformer structure but loses the RNN's sequence-modeling ability; since a student's learning is continuous, the model's sequence-modeling ability cannot be neglected.
Disclosure of Invention
1. To address the shortcomings of the knowledge query network and the self-attentive knowledge tracking model, the invention aims to combine their advantages: a self-attention mechanism is introduced into the knowledge query network model to obtain more accurate internal key features of the student's historical interaction sequence while retaining the recurrent modeling capacity of the long short-term memory network; a regularization term is also introduced into the loss function to enhance the consistency of model predictions and address the reconstruction error of the knowledge query network;
2. The technical innovations of the invention are as follows: (1) a deep knowledge tracking model is proposed that adds a self-attention mechanism to the knowledge query network; the positional information provided by the long short-term memory network is used to model the ordering of the student's historical interaction sequence, preserving the model's sequence-modeling capability, while the self-attention mechanism relates different positions of a single sequence when computing its representation, yielding more accurate internal key features of the student's historical answer records; fusing the advantages of the two improves prediction performance. (2) A regularization term corresponding to the reconstruction problem is introduced into the model's loss function to enhance prediction consistency and thereby address the reconstruction error of KQN;
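A loss with such a reconstruction regularizer can be sketched as follows (a hypothetical form: binary cross-entropy on the next answer plus a weighted cross-entropy term on the answer just observed; the weight `lam` and the exact form are assumptions, not the patent's specification):

```python
import numpy as np

def bce(p, y, eps=1e-7):
    """Binary cross-entropy between predicted probabilities and labels."""
    p = np.clip(p, eps, 1 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).mean()

def regularized_loss(pred_next, y_next, pred_current, y_current, lam=0.1):
    """Prediction loss on the next answer plus a reconstruction term that
    penalizes the model for contradicting the answer it has just observed
    (the lambda weight and this exact form are assumptions)."""
    return bce(pred_next, y_next) + lam * bce(pred_current, y_current)

loss = regularized_loss(
    np.array([0.8, 0.3]), np.array([1.0, 0.0]),   # next-step predictions
    np.array([0.9, 0.2]), np.array([1.0, 0.0]))   # current-step predictions
```

The added term pushes the model's current-step prediction toward consistency with the observed answer, which is exactly the behavior the reconstruction error violates.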
drawings
FIG. 1 is a diagram of a knowledge query network architecture incorporating a self-attention mechanism.
Detailed Description
The accompanying drawing shows the structure of the model of the invention at time t. The model consists of three parts: a knowledge state encoder, a skill encoder, and a knowledge state query. At time t, the knowledge state encoder first feeds the student's historical interaction tuple x_t into the LSTM layer to obtain the hidden state h_t, then feeds h_t into the attention layer to obtain a_t, and finally transforms a_t into the d-dimensional knowledge state vector KS_t. The skill encoder embeds the skill q_(t+1) contained in the next time step t+1 into a skill vector S_(t+1), also of dimension d, through a multilayer perceptron (MLP). The two vectors are then passed to the knowledge state query component, which describes the interaction between the student's knowledge state and the KC contained in the question as the dot product of the two vectors; finally, the dot product is passed through the sigmoid function to obtain the predicted probability that the student at the current time step can correctly answer the question at the next time step.
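Putting the three components together, the forward pass at time t can be sketched as below (a simplified stand-in: a plain tanh recurrence replaces the LSTM, the skill MLP is a single layer, and all weights are random placeholders; none of this is the patent's actual parameterization):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class KQNWithAttention:
    """Sketch of the forward pass: interactions -> recurrent hidden
    states h_t -> attention output a_t -> knowledge state KS_t; next
    skill q_(t+1) -> MLP -> skill vector S_(t+1); dot product ->
    sigmoid -> prediction."""
    def __init__(self, n_inter, n_skill, d, seed=0):
        rng = np.random.default_rng(seed)
        self.Wx = rng.normal(0, 0.1, (n_inter, d))   # interaction embedding
        self.Wh = rng.normal(0, 0.1, (d, d))         # recurrence (LSTM stand-in)
        self.Wq = rng.normal(0, 0.1, (d, d))
        self.Wk = rng.normal(0, 0.1, (d, d))
        self.Wv = rng.normal(0, 0.1, (d, d))
        self.Wo = rng.normal(0, 0.1, (d, d))         # maps a_t to KS_t
        self.Ws = rng.normal(0, 0.1, (n_skill, d))   # one-layer skill "MLP"

    def forward(self, interactions, next_skill):
        d = self.Wh.shape[0]
        h, H = np.zeros(d), []
        for x in interactions:                       # interaction ids
            h = np.tanh(self.Wx[x] + h @ self.Wh)    # recurrent hidden state
            H.append(h)
        H = np.stack(H)
        Q, K, V = H @ self.Wq, H @ self.Wk, H @ self.Wv
        A = softmax(Q @ K.T / np.sqrt(d)) @ V        # attention over positions
        ks = np.tanh(A[-1] @ self.Wo)                # knowledge state KS_t
        s = np.tanh(self.Ws[next_skill])             # skill vector S_(t+1)
        return sigmoid(ks @ s)                       # P(correct at t+1)

model = KQNWithAttention(n_inter=20, n_skill=10, d=8)
p = model.forward([3, 7, 12], next_skill=4)
```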
Claims (5)
1. A deep knowledge tracking model of a knowledge query network introducing a self-attention mechanism, characterized in that: the knowledge state encoder first uses the positional information provided by a long short-term memory network to model the ordering of the student's interaction sequence; a self-attention mechanism then relates different positions of the single sequence when computing its representation, obtaining more accurate internal key features of the student's historical answer records, and the output of the attention layer is encoded into a knowledge state vector. Finally, the model takes the dot product of the skill vector produced by the skill encoder's multilayer perceptron and the knowledge state vector produced by the knowledge state encoder to simulate the interaction between the knowledge state and the knowledge point, and feeds the result into a sigmoid function to obtain the probability that the student answers the next question correctly. A regularization term corresponding to the reconstruction problem is introduced into the loss function to enhance the consistency of model predictions and thereby address the reconstruction error.
2. The knowledge query network model introducing a self-attention mechanism as claimed in claim 1, wherein: the knowledge state encoder retains the long short-term memory network's ability to model sequences while using the self-attention mechanism to automatically focus on the answer records in the student's historical interaction sequence that have the greatest influence on the prediction, extracting more accurate features of the student's knowledge state.
3. The knowledge query network model introducing a self-attention mechanism as claimed in claim 1, wherein: the regularization term added for the reconstruction problem regularizes the original model by taking into account, in addition to the prediction loss, the loss on the interaction between the current student knowledge state and the skill.
4. The knowledge query network model introducing a self-attention mechanism as claimed in claim 1, wherein: the student interaction sequence input to the knowledge state encoder and the skills input to the skill encoder are encoded as one-hot vectors.
5. The knowledge query network model introducing a self-attention mechanism as claimed in claim 1, wherein: taking the dot product of the knowledge state vector output by the knowledge state encoder and the skill vector output by the skill encoder matches the real-world situation in which students answer questions based on their own knowledge state and the question itself.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111560167.XA CN114266340A (en) | 2021-12-20 | 2021-12-20 | Knowledge query network model introducing self-attention mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114266340A true CN114266340A (en) | 2022-04-01 |
Family
ID=80827932
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111560167.XA Pending CN114266340A (en) | 2021-12-20 | 2021-12-20 | Knowledge query network model introducing self-attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114266340A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116341990A (en) * | 2023-05-29 | 2023-06-27 | 中交第四航务工程勘察设计院有限公司 | Knowledge management evaluation method and system for infrastructure engineering |
CN116341990B (en) * | 2023-05-29 | 2023-08-04 | 中交第四航务工程勘察设计院有限公司 | Knowledge management evaluation method and system for infrastructure engineering |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||