CN117992924A - HyperMixer-based knowledge tracking method - Google Patents

Publication number
CN117992924A
Authority
CN
China
Prior art keywords
sequence
mixer
hypermixer
label
knowledge
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410389529.0A
Other languages
Chinese (zh)
Other versions
CN117992924B (en)
Inventor
王俊
陈恳
李子杰
王明杰
夏跃龙
邹伟
周菊香
甘健侯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan Normal University
Original Assignee
Yunnan Normal University
Application filed by Yunnan Normal University
Priority to CN202410389529.0A
Publication of CN117992924A
Application granted
Publication of CN117992924B
Legal status: Active

Landscapes

  • Electrically Operated Instructional Devices (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of deep learning, and in particular to a HyperMixer-based knowledge tracking method. The feature-mixing layers specific to the HyperMixer architecture perform well across different tasks and mix feature information across the whole sequence, so that the model can capture long-range dependencies while preserving the details of local features. The method addresses the problem of how to make a model capture long-range dependencies while still handling local feature details.

Description

HyperMixer-based knowledge tracking method
Technical Field
The invention relates to the technical field of deep learning, in particular to a HyperMixer-based knowledge tracking method.
Background
With the continuous development of artificial intelligence technology, intelligent education has become a topic of interest to researchers. The popularity of deep learning brings new opportunities for applying technology in education, and the use of various neural networks in intelligent educational algorithms has become a current research focus.
Knowledge tracking (also known as knowledge tracing) is an important part of intelligent education research. It aims to predict the future performance of students from their historical interaction records, which are usually records of how the students answered past exercises. With it, teachers and learners on a learning platform can grasp the learner's current knowledge state and diagnose which knowledge points have not been sufficiently mastered, so as to actively adjust subsequent teaching strategies or learning paths.
From the perspective of sequence modeling, existing knowledge tracking methods can be largely divided into recurrent neural network-based methods and Transformer-based methods. These methods follow the same setup as other sequence tasks: they encode the historical information with an encoder and then select the hidden state at the corresponding time step for prediction.
However, when the encoder processes the historical information, some important details or context may be lost, so the encoder easily loses local feature details of the sequence data; and when the hidden state at the corresponding time step is selected for prediction, the model may depend too heavily on recent history and fail to capture long-term information, so long-range dependencies in the sequence data cannot be captured.
The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present invention and is not intended to represent an admission that the foregoing is prior art.
Disclosure of Invention
The invention mainly aims to provide a HyperMixer-based knowledge tracking method, which aims to solve the problem of how to enable a model to capture long-range dependency relationships and maintain detail processing of local features.
In order to achieve the above object, the present invention provides a HyperMixer-based knowledge tracking method, which includes:
acquiring an interaction sequence corresponding to a learner at a target moment;
determining exercise labels, knowledge point labels and answer situation labels in the interaction sequence;
embedding the exercise label, the knowledge point label and the answer situation label;
capturing full-length sequence features in the embedded interaction sequence based on a HyperMixer global feature mixer; and
capturing interaction features within a preset historical time period in the embedded interaction sequence based on a HyperMixer local feature mixer;
fusing the full-length sequence features with the interaction features to obtain a feature mixer result, and predicting the exercise result, the knowledge point result and the answer situation result of the learner at the next moment according to the feature mixer result.
Optionally, the step of embedding the exercise label, the knowledge point label and the answer situation label includes:
embedding each exercise label in the interaction sequence to obtain an exercise embedding matrix, whose dimensions are determined by the total number of exercises of the learner and the embedding dimension d;
embedding each knowledge point label in the interaction sequence to obtain a knowledge point embedding matrix, whose dimensions are determined by the total number of knowledge points of the learner and the embedding dimension d;
embedding the answer situation label to obtain an answer embedding matrix.
Optionally, the HyperMixer-based global feature mixer includes a global sequence mixer and a global channel mixer. The expression for capturing the full-length sequence features in the embedded interaction sequence based on the HyperMixer global feature mixer is defined in terms of the following quantities: all features in the embedded interaction sequence, which form the input of the global sequence mixer and the global channel mixer; the weight matrices of the Sequence Mixer and the Channel Mixer; the corresponding bias vectors; and GELU, a nonlinear activation function.
Optionally, the weight matrices are generated by parameterized functions: another MLP generates a weight matrix from each token independently, a position-encoding vector supplies additional information, and MLP denotes a multi-layer perceptron.
Optionally, the HyperMixer-based local feature mixer includes a local sequence mixer and a local channel mixer. The expression for capturing the interaction features within the preset historical time period in the embedded interaction sequence based on the HyperMixer local feature mixer is defined in terms of the following quantities: the interaction features in the interaction sequence within the preset historical time period; the weight matrices of the Sequence Mixer and the Channel Mixer; the corresponding bias vectors; and GELU, a nonlinear activation function.
Optionally, the weight matrices are generated by parameterized functions: another MLP generates a weight matrix from each token independently, a position-encoding vector supplies additional information, and MLP denotes a multi-layer perceptron.
Optionally, the step of fusing the full-length sequence features with the interaction features to obtain a feature mixer result, and predicting the exercise result, the knowledge point result and the answer situation result of the learner at the next moment according to the feature mixer result includes:
splicing the full-length sequence features and the interaction features to obtain a spliced feature;
taking the spliced feature as the sequence feature of the current moment, and predicting the exercise result, the knowledge point result and the answer situation result of the next moment;
wherein the prediction uses a sigmoid function and a fully connected layer parameter; the fully connected layer reduces the dimension through a neural network while the output is adjusted to the range of 0 to 1, and the prediction targets are the exercise label, the knowledge point label and the answer situation label at time t+1.
Optionally, after the step of fusing the full-length sequence features with the interaction features to obtain a feature mixer result and predicting the exercise result, the knowledge point result and the answer situation result of the learner at the next moment according to the feature mixer result, the method further includes:
calculating a score corresponding to each interaction sequence of the learner at each time step based on a binary cross-entropy loss function, wherein the loss of a single sequence is computed from the predictions and the real labels of the student's interactions.
In addition, to achieve the above object, the present invention further provides a HyperMixer-based knowledge tracking system, which includes: a memory, a processor, and a HyperMixer-based knowledge tracking program stored on the memory and executable on the processor, the HyperMixer-based knowledge tracking program, when executed by the processor, implementing the steps of the HyperMixer-based knowledge tracking method as described in any one of the above.
In addition, to achieve the above object, the present invention further provides a computer-readable storage medium having stored thereon a knowledge tracking program based on HyperMixer, which when executed by a processor, implements the steps of the knowledge tracking method based on HyperMixer as set forth in any one of the above.
The invention provides a HyperMixer-based knowledge tracking method, a HyperMixer-based knowledge tracking device and a computer-readable storage medium. The feature-mixing layers specific to the HyperMixer architecture perform well across different tasks and mix feature information across the whole sequence, so that the model can capture long-range dependencies while preserving the details of local features.
Drawings
FIG. 1 is a schematic architecture diagram of the hardware operating environment of a HyperMixer-based knowledge tracking system according to an embodiment of the invention;
FIG. 2 is a flowchart of a first embodiment of the HyperMixer-based knowledge tracking method according to the present invention.
The achievement of the objects, the functional features and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description
In order to better understand the above technical solution, exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As an implementation manner, fig. 1 is a schematic architecture diagram of a hardware running environment of a HyperMixer-based knowledge tracking system according to an embodiment of the present invention.
As shown in FIG. 1, the HyperMixer-based knowledge tracking system may include: a processor 1001 (such as a CPU), a memory 1005, a user interface 1003, a network interface 1004 and a communication bus 1002. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard, and optionally may further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM or a non-volatile memory such as a disk memory. The memory 1005 may also optionally be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the HyperMixer-based knowledge tracking system architecture shown in fig. 1 does not constitute a limitation of the HyperMixer-based knowledge tracking system, and may include more or fewer components than shown, or may combine certain components, or may be a different arrangement of components.
As shown in FIG. 1, the memory 1005, as one type of storage medium, may include an operating system, a network communication module, a user interface module and a HyperMixer-based knowledge tracking program. The operating system is a program that manages and controls the hardware and software resources of the HyperMixer-based knowledge tracking system and supports the running of the HyperMixer-based knowledge tracking program and other software or programs.
In the HyperMixer-based knowledge tracking system shown in FIG. 1, the user interface 1003 is mainly used for connecting to a terminal and communicating data with the terminal; the network interface 1004 is mainly used for connecting to a background server and communicating data with the background server; the processor 1001 may be configured to invoke the HyperMixer-based knowledge tracking program stored in the memory 1005.
In this embodiment, the HyperMixer-based knowledge tracking system includes: a memory 1005, a processor 1001, and a HyperMixer-based knowledge tracking program stored on the memory and executable on the processor, wherein:
when the processor 1001 calls the HyperMixer-based knowledge tracking program stored in the memory 1005, the following operations are performed:
acquiring an interaction sequence corresponding to a learner at a target moment;
determining exercise labels, knowledge point labels and answer situation labels in the interaction sequence;
embedding the exercise label, the knowledge point label and the answer situation label;
capturing full-length sequence features in the embedded interaction sequence based on a HyperMixer global feature mixer; and
capturing interaction features within a preset historical time period in the embedded interaction sequence based on a HyperMixer local feature mixer;
fusing the full-length sequence features with the interaction features to obtain a feature mixer result, and predicting the exercise result, the knowledge point result and the answer situation result of the learner at the next moment according to the feature mixer result.
When the processor 1001 calls the HyperMixer-based knowledge tracking program stored in the memory 1005, the following operations are performed:
embedding each exercise label in the interaction sequence to obtain an exercise embedding matrix, whose dimensions are determined by the total number of exercises of the learner and the embedding dimension d;
embedding each knowledge point label in the interaction sequence to obtain a knowledge point embedding matrix, whose dimensions are determined by the total number of knowledge points of the learner and the embedding dimension d;
embedding the answer situation label to obtain an answer embedding matrix.
When the processor 1001 calls the HyperMixer-based knowledge tracking program stored in the memory 1005, the following operations are performed:
splicing the full-length sequence features and the interaction features to obtain a spliced feature;
taking the spliced feature as the sequence feature of the current moment, and predicting the exercise result, the knowledge point result and the answer situation result of the next moment;
wherein the prediction uses a sigmoid function and a fully connected layer parameter; the fully connected layer reduces the dimension through a neural network while the output is adjusted to the range of 0 to 1, and the prediction targets are the exercise label, the knowledge point label and the answer situation label at time t+1.
When the processor 1001 calls the HyperMixer-based knowledge tracking program stored in the memory 1005, the following operations are performed:
calculating a score corresponding to each interaction sequence of the learner at each time step based on a binary cross-entropy loss function, wherein the loss of a single sequence is computed from the predictions and the real labels of the student's interactions.
Based on the hardware architecture of the HyperMixer knowledge tracking system based on the deep learning technology, an embodiment of the HyperMixer knowledge tracking method is provided.
Referring to fig. 2, in a first embodiment, the HyperMixer-based knowledge tracking method includes the steps of:
Step S10, acquiring an interaction sequence corresponding to a learner at a target moment;
Step S20, determining the exercise labels, knowledge point labels and answer situation labels in the interaction sequence;
In this embodiment, a historical time step in the learner's interaction sequence is selected as the target moment t, and the exercise label, the knowledge point label and the answer situation label in the interaction sequence at the target moment t are determined.
The interaction sequence records the learner's interactions during the learning process. In this embodiment it mainly comprises exercise labels, knowledge point labels and answer situation labels; by analyzing the interaction sequence, the exercise labels, knowledge point labels and answer situation labels of the learner within a specific time period are determined, so that the model can identify the learner's learning performance.
The exercise labels reflect the learner's performance on different exercise tasks, including but not limited to the difficulty, completion and time consumption of the exercises.
The knowledge point labels characterize the knowledge points the learner has encountered during learning; they help the model learn how well the learner has mastered each knowledge point, as well as the association and transfer between different knowledge points.
The answer situation labels characterize the learner's performance when answering, including but not limited to the accuracy of the answers, the types of errors and the answering time. By analyzing the answer situation labels, the model can learn the learner's error-correction ability, reasoning ability and capacity for self-reflection when solving problems.
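As a minimal sketch of the interaction sequence described in steps S10 and S20, each interaction can be held as a record of three labels. The class name `Interaction`, the field names and the helper `split_labels` are hypothetical illustrations, not names from the patent, and the binary encoding of the answer (1 for correct, 0 for wrong) is an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Interaction:
    """One learner interaction (hypothetical field names)."""
    exercise_id: int  # exercise label
    concept_id: int   # knowledge point label
    correct: int      # answer situation label, assumed 1 = correct, 0 = wrong


def split_labels(seq: List[Interaction]) -> Tuple[List[int], List[int], List[int]]:
    """Return the three parallel label sequences of an interaction sequence."""
    return (
        [x.exercise_id for x in seq],
        [x.concept_id for x in seq],
        [x.correct for x in seq],
    )


# A short history of three interactions for one learner.
history = [Interaction(12, 3, 1), Interaction(7, 3, 0), Interaction(12, 5, 1)]
exercises, concepts, answers = split_labels(history)
```

Keeping the three label sequences parallel makes it straightforward to embed each of them separately in the next step.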
Step S30, embedding the exercise label, the knowledge point label and the answer situation label;
In this embodiment, the exercise labels, knowledge point labels and answer situation labels in the sequence are discrete. Embedding encodes these high-dimensional discrete symbols or categories as low-dimensional continuous vectors, which makes them easier for the model to learn.
Optionally, to embed the exercise labels, each exercise label in the interaction sequence is embedded to obtain an exercise embedding matrix, whose dimensions are determined by the total number of exercises of the learner and the embedding dimension d.
Optionally, to embed the knowledge point labels, each knowledge point label in the interaction sequence is embedded to obtain a knowledge point embedding matrix, whose dimensions are determined by the total number of knowledge points of the learner and the embedding dimension d.
Optionally, to embed the answer situation labels, an answer embedding matrix is used; because an answer is either "correct" or "wrong", this matrix only needs to cover those two cases.
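The embedding step can be illustrated with plain lookup tables. This is a sketch under stated assumptions: the random initialisation, the table sizes (100 exercises, 20 knowledge points) and the choice to concatenate the three embeddings into one interaction vector are all assumptions for illustration; the source text does not state how the three embeddings are combined. The function names are hypothetical.

```python
import random
from typing import List

Matrix = List[List[float]]


def make_embedding(num_rows: int, d: int, seed: int) -> Matrix:
    """Create a (num_rows x d) lookup table with small random values."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(num_rows)]


def embed_interaction(ex_id: int, kc_id: int, ans: int,
                      e_ex: Matrix, e_kc: Matrix, e_ans: Matrix) -> List[float]:
    """Look up the three embeddings and concatenate them (assumed combination)."""
    return e_ex[ex_id] + e_kc[kc_id] + e_ans[ans]


d = 4                                     # embedding dimension (illustrative)
E_ex = make_embedding(100, d, seed=0)     # one row per exercise
E_kc = make_embedding(20, d, seed=1)      # one row per knowledge point
E_ans = make_embedding(2, d, seed=2)      # two rows: wrong (0) and correct (1)
x_t = embed_interaction(12, 3, 1, E_ex, E_kc, E_ans)
```

In a trained model the table entries would be learned parameters rather than fixed random values.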
Step S40, capturing full-length sequence features in the embedded interaction sequence based on a HyperMixer global feature mixer, and capturing interaction features within a preset historical time period in the embedded interaction sequence based on a HyperMixer local feature mixer;
In this embodiment, the HyperMixer model includes a global feature mixer and a local feature mixer. The global feature mixer captures the full-length sequence features in the embedded interaction sequence, and the local feature mixer captures the interaction features within a preset historical time period in the embedded interaction sequence. The feature-mixing layers specific to the HyperMixer architecture perform well across different tasks and mix feature information across the whole sequence, so that the model can capture long-range dependencies while preserving the details of local features.
Optionally, the global feature mixer comprises a global sequence mixer and a global channel mixer. The expression for the full-length sequence features is defined in terms of the following quantities: all features in the embedded interaction sequence; the weight matrices of the Sequence Mixer and the Channel Mixer; the corresponding bias vectors; and GELU, a nonlinear activation function.
Further and optionally, the weight matrices are generated by parameterized functions: another MLP generates a weight matrix from each token independently, a position-encoding vector supplies additional information, and MLP denotes a multi-layer perceptron.
Illustratively, a maximum sequence length is set, and the input of the global feature mixer is the embedded interaction sequence of up to that length. Each element of the input is the interaction that occurs at a time t, and comprises the embedding of the exercise label, the embedding of the knowledge point label and the embedding of the answer situation label at time t.
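The following is a rough sketch of a mixer built from a sequence mixer whose weights are generated from the tokens, followed by a channel mixer. It is not the patent's expression, which is not reproduced in this text. The assumptions are: a single linear map stands in for the weight-generating MLP, biases appear only in the channel mixer, and residual connections and normalisation are omitted. All function and parameter names are hypothetical.

```python
import math
import random


def gelu(v):
    # Exact GELU using the Gaussian error function.
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def matmul(a, b):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def sequence_mixer(x, pos, h1, h2):
    """x, pos: (n, d) tokens and position codes; h1, h2: (d, k) generator weights."""
    xp = [[a + b for a, b in zip(xr, pr)] for xr, pr in zip(x, pos)]
    w1 = matmul(xp, h1)  # (n, k): one row generated from each token independently
    w2 = matmul(xp, h2)  # (n, k)
    hidden = [[gelu(v) for v in row] for row in matmul(transpose(w1), x)]  # (k, d)
    return matmul(w2, hidden)  # (n, d): information mixed across positions


def channel_mixer(x, w1, b1, w2, b2):
    """Two-layer MLP applied to each token over its feature dimension."""
    hidden = [[gelu(v + b) for v, b in zip(row, b1)] for row in matmul(x, w1)]
    return [[v + b for v, b in zip(row, b2)] for row in matmul(hidden, w2)]


def feature_mixer(x, pos, p):
    mixed = sequence_mixer(x, pos, p["h1"], p["h2"])
    return channel_mixer(mixed, p["w1"], p["b1"], p["w2"], p["b2"])


rng = random.Random(0)
n, d, k = 4, 3, 2  # sequence length, feature size, hidden size (illustrative)
x = rand_matrix(n, d, rng)
pos = rand_matrix(n, d, rng)
params = {"h1": rand_matrix(d, k, rng), "h2": rand_matrix(d, k, rng),
          "w1": rand_matrix(d, k, rng), "b1": [0.0] * k,
          "w2": rand_matrix(k, d, rng), "b2": [0.0] * d}
out = feature_mixer(x, pos, params)
```

Because the mixing weights are produced from the tokens themselves, the same parameters can be applied to sequences of different lengths, which is the property that distinguishes this style of mixer from a fixed-size token-mixing MLP.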
Optionally, the HyperMixer-based local feature mixer comprises a local sequence mixer and a local channel mixer. The expression for the interaction features within the preset historical time period is defined in terms of the following quantities: the interaction features in the interaction sequence within the preset historical time period; the weight matrices of the Sequence Mixer and the Channel Mixer; the corresponding bias vectors; and GELU, a nonlinear activation function.
Further and optionally, the weight matrices are generated by parameterized functions: another MLP generates a weight matrix from each token independently, a position-encoding vector supplies additional information, and MLP denotes a multi-layer perceptron.
Illustratively, the local feature mixer is set to capture the features of a fixed number of most recent moments, and the input of the local mixer consists of the interactions at those moments. Each element of the input is the interaction that occurs at a time t, and comprises the embedding of the exercise label, the embedding of the knowledge point label and the embedding of the answer situation label at time t.
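Selecting the input of the local mixer amounts to taking a window over the most recent embedded interactions. The sketch below assumes left-padding with a fixed vector when the history is shorter than the window; the padding choice and the name `local_window` are assumptions, not details from the source.

```python
from typing import List

Vector = List[float]


def local_window(embedded_seq: List[Vector], k: int, pad: Vector) -> List[Vector]:
    """Return the last k embedded interactions, left-padded if history is short."""
    recent = embedded_seq[-k:] if k > 0 else []
    return [pad] * (k - len(recent)) + recent


seq = [[1.0], [2.0], [3.0]]
window = local_window(seq, 2, pad=[0.0])         # two most recent interactions
short = local_window([[1.0]], 3, pad=[0.0])      # history shorter than the window
```

The same mixer structure used for the full sequence can then be applied to this shorter window.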
Step S50, fusing the full-length sequence features with the interaction features to obtain a feature mixer result, and predicting the exercise result, the knowledge point result and the answer situation result of the learner at the next moment according to the feature mixer result.
In this embodiment, the full-length sequence features and the interaction features are fused to obtain a feature mixer result that both captures long-range dependencies and preserves the details of local features, and the exercise result, the knowledge point result and the answer situation result of the learner at the next moment are predicted according to the feature mixer result.
Optionally, the full-length sequence features and the interaction features are spliced to obtain a spliced feature;
the spliced feature is taken as the sequence feature of the current moment, and the exercise result, the knowledge point result and the answer situation result of the next moment are predicted;
wherein the prediction uses a sigmoid function and a fully connected layer parameter; the fully connected layer reduces the dimension through a neural network while the output is adjusted to the range of 0 to 1, and the prediction targets are the exercise label, the knowledge point label and the answer situation label at time t+1.
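The fusion and prediction step can be sketched as concatenation followed by a fully connected layer and a sigmoid. Only a single output (for example the answer situation result) is shown; treating the other two results as further heads of the same form is an assumption, as are the names `predict`, `w` and `b`.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    # Numerically stable form that avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(global_feat: List[float], local_feat: List[float],
            w: List[float], b: float) -> float:
    """Splice the two feature vectors and map them to a value between 0 and 1."""
    fused = global_feat + local_feat  # splicing = vector concatenation
    z = sum(wi * fi for wi, fi in zip(w, fused)) + b
    return sigmoid(z)


p = predict([0.2, -0.1], [0.4], w=[0.5, 0.5, 0.5], b=0.0)
```

The weight vector has one entry per fused feature, so the layer reduces the concatenated vector to a single score before the sigmoid is applied.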
Furthermore, to train the model, a score corresponding to each interaction sequence of the learner at each time step is calculated based on a binary cross-entropy loss function, wherein the loss of a single sequence is computed from the predictions and the real labels of the student's interactions. An Adam optimizer is used for optimization, and the training objective is to minimize the loss.
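For reference, a binary cross-entropy loss over one sequence can be written as below. Averaging over time steps (rather than summing) and clipping predictions away from 0 and 1 are assumptions made for this sketch; the patent's own loss expression is not reproduced in this text.

```python
import math
from typing import Sequence


def bce_loss(y_true: Sequence[int], y_pred: Sequence[float],
             eps: float = 1e-12) -> float:
    """Mean binary cross-entropy over the time steps of one sequence."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


loss = bce_loss([1, 0, 1], [0.9, 0.2, 0.6])
```

The loss shrinks as predictions move toward the true labels, which is what the optimizer exploits during training.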
In the technical solution provided by this embodiment, the feature-mixing layers specific to the HyperMixer architecture perform well across different tasks and mix feature information across the whole sequence, so that the model can capture long-range dependencies while preserving the details of local features.
In addition, to verify the effect of the present invention, its performance was compared with baseline models such as DKT, SAKT and SAINT in two modes, Question Level and KC Level, under a five-fold cross-validation setting. All experiments were run on the same platform: the CPU is an Intel Xeon E-2288G eight-core processor, and the GPU is an NVIDIA A100 PCIe (80 GB HBM2e), which supports high-load deep learning tasks. An early-stopping strategy is used during training; the number of training epochs and the exercise embedding dimension d are preset, and dropout is set to 0.2 to prevent overfitting.
Table 1 shows the performance of the method and the baseline models at the Question Level (All-in-One) and the KC Level (All-in-One) on the AS2009 and AL2005 data sets. The method exceeds the other baseline models at both the Question Level and the KC Level on both data sets. At the Question Level, the accuracy of the method on the AS2009 data set is 1.05% higher than that of the DKT model and 4% higher than that of the SAKT model, and on the AL2005 data set it is 1.17% higher than that of the DKT model. At the KC Level the method also improves: accuracy rises by 1.62% over the DKT model and 4.96% over the SAKT model, and by 0.47% over the DKT model and 5.11% over the SAKT model on the AS2009 data set. The results show that, even without a complex sequence model structure, a pure MLP architecture can effectively capture the knowledge state and learning progress of students.
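The five-fold setting mentioned above can be sketched as an index split. Assigning items to folds in an interleaved way, and splitting by item index rather than by student, are assumptions for illustration; the source does not describe how the folds were formed, and `k_fold_indices` is a hypothetical name.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n: int, k: int = 5) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, held_out) index lists for each of k folds."""
    folds = [list(range(i, n, k)) for i in range(k)]  # interleaved assignment
    for i in range(k):
        held_out = folds[i]
        train = sorted(j for m, fold in enumerate(folds) if m != i for j in fold)
        yield train, held_out


splits = list(k_fold_indices(10, k=5))
```

Each item is held out exactly once, so every reported metric can be averaged over the five held-out folds.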
Furthermore, it will be appreciated by those of ordinary skill in the art that implementing all or part of the processes in the methods of the above embodiments may be accomplished by computer programs to instruct related hardware. The computer program comprises program instructions, and the computer program may be stored in a storage medium, which is a computer readable storage medium. The program instructions are executed by at least one processor in the HyperMixer-based knowledge tracking system to implement the flow steps of the embodiments of the method described above.
Accordingly, the present invention also provides a computer-readable storage medium storing a HyperMixer-based knowledge tracking program which, when executed by a processor, implements the steps of the HyperMixer-based knowledge tracking method described in the above embodiment.
The computer-readable storage medium may be any medium that can store program code, such as a USB disk, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk or an optical disk.
It should be noted that, because the storage medium provided in the embodiments of the present application is a storage medium used for implementing the method in the embodiments of the present application, based on the method described in the embodiments of the present application, a person skilled in the art can understand the specific structure and the modification of the storage medium, and therefore, the description thereof is omitted herein. All storage media adopted by the method of the embodiment of the application belong to the scope of protection of the application.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flowchart and/or block of the flowchart illustrations and/or block diagrams, and combinations of flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any ordering. These words may be interpreted as names.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1. A HyperMixer-based knowledge tracking method, the method comprising:
acquiring an interaction sequence corresponding to a learner at a target moment;
determining exercise labels, knowledge point labels and answer situation labels in the interaction sequence;
embedding the exercise label, the knowledge point label and the answer situation label;
capturing full-length sequence features in the embedded interaction sequence based on a HyperMixer global feature mixer; and
capturing interaction features within a preset historical time period in the embedded interaction sequence based on a HyperMixer local feature mixer;
fusing the full-length sequence features with the interaction features to obtain a feature mixer result, and predicting the exercise result, the knowledge point result and the answer situation result of the learner at the next moment according to the feature mixer result.
2. The method of claim 1, wherein the step of embedding the exercise label, the knowledge point label and the answer situation label comprises:
embedding each exercise label in the interaction sequence to obtain an exercise embedding matrix [formula], wherein [symbol] is the total number of exercises of the learner and d represents the embedding dimension;
embedding each knowledge point label in the interaction sequence to obtain a knowledge point embedding matrix [formula], wherein [symbol] is the total number of knowledge points of the learner and d represents the embedding dimension;
embedding the answer situation label to obtain an answer situation embedding matrix [formula].
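The embedding step of claim 2 can be illustrated with a minimal sketch. It assumes each label is an integer index into a lookup table with one row of length d per label. The table sizes, the random initial values (standing in for learned parameters) and all names below are hypothetical and are not taken from the patent.

```python
import random

def make_embedding(num_labels, dim, seed):
    """Build a num_labels x dim table of small random values.
    Assumption: random values stand in for learned embedding weights."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_labels)]

def embed(table, labels):
    """Look up one row per label, giving a len(labels) x dim matrix."""
    return [list(table[i]) for i in labels]

# Hypothetical sizes: 5 exercises, 3 knowledge points, 2 answer outcomes, d = 4.
D = 4
exercise_table = make_embedding(5, D, seed=0)
concept_table = make_embedding(3, D, seed=1)
answer_table = make_embedding(2, D, seed=2)

# One learner's interaction sequence: (exercise id, knowledge point id, answered correctly?)
interactions = [(0, 1, 1), (3, 2, 0), (4, 0, 1)]
exercise_emb = embed(exercise_table, [e for e, _, _ in interactions])
concept_emb = embed(concept_table, [k for _, k, _ in interactions])
answer_emb = embed(answer_table, [a for _, _, a in interactions])
```

Each of the three results has one row per interaction and d columns, so they can be combined position by position before being passed to the mixers.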
3. The method of claim 1, wherein the HyperMixer global feature mixer comprises a global sequence mixer [symbol] and a global channel mixer [symbol], and the full-length sequence features in the embedded interaction sequence are captured by the HyperMixer global feature mixer according to the following expression:
[formula]
wherein [formula],
where [symbol] denotes all features in the embedded interaction sequence, [symbol] and [symbol] are the weight matrices of the Sequence Mixer and the Channel Mixer, respectively, [symbol] and [symbol] are bias vectors, GELU is a nonlinear activation function, and [symbol] represents the inputs of the global sequence mixer and the global channel mixer.
4. The method of claim 3, wherein
the [symbol] and the [symbol] are generated by a parameterized function [symbol], a weight matrix being generated independently from each token by another MLP:
[formula]
where [symbol] is a vector carrying additional information through position coding, and [symbol] represents a multi-layer perceptron (MLP).
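The idea in claims 3 and 4, a sequence mixer whose weight matrices are generated from the tokens themselves, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's formula, which is not reproduced in this text. It assumes the mixing has the form Y = W2 · GELU(W1ᵀ · X), and that each row of W1 and W2 is produced from one token plus its position code. A single linear map (`proj1`, `proj2`) stands in for the MLP, and bias terms are omitted. All names and sizes are hypothetical.

```python
import math
import random

def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n), both given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def gelu(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def hyper_weights(x, pos, proj):
    """Generate one weight row per token from (token + position code).
    Assumption: the linear map `proj` stands in for the MLP named in the claim."""
    shifted = [[xi + pi for xi, pi in zip(x_row, p_row)] for x_row, p_row in zip(x, pos)]
    return matmul(shifted, proj)  # n_tokens x hidden

def sequence_mix(x, pos, proj1, proj2):
    """Mix information across tokens: Y = W2 * GELU(W1^T * X)."""
    w1 = hyper_weights(x, pos, proj1)    # n x h, depends on the input
    w2 = hyper_weights(x, pos, proj2)    # n x h, depends on the input
    hidden = matmul(transpose(w1), x)    # h x d
    hidden = [[gelu(v) for v in row] for row in hidden]
    return matmul(w2, hidden)            # n x d

rng = random.Random(0)
d, h = 4, 3  # hypothetical feature and hidden sizes
proj1, proj2 = rand_matrix(d, h, rng), rand_matrix(d, h, rng)

# The same parameters are applied to sequences of two different lengths.
outputs = {}
for n in (5, 8):
    x, pos = rand_matrix(n, d, rng), rand_matrix(n, d, rng)
    outputs[n] = sequence_mix(x, pos, proj1, proj2)
```

Because the mixing weights are generated per token, the learned parameters do not fix the sequence length, which is why the loop can reuse `proj1` and `proj2` for both inputs.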
5. The method of claim 1, wherein the HyperMixer local feature mixer comprises a local sequence mixer [symbol] and a local channel mixer [symbol], and the interaction features within the preset historical time period in the embedded interaction sequence are captured by the HyperMixer local feature mixer according to the following expression:
[formula]
wherein [symbol] denotes the interaction features in the interaction sequence within the preset historical time period, [formula],
where [symbol] and [symbol] are the weight matrices of the [symbol] and the [symbol] Channel Mixer, respectively, [symbol] and [symbol] are bias vectors, and GELU is a nonlinear activation function.
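The two ingredients named in claim 5 can be sketched in a few lines: restricting the input to a preset recent window, and a channel mixer built from weight matrices, bias vectors and GELU. The sketch assumes the window is simply the last `window` interactions, and that the channel mixer is a two-layer perceptron applied to each token separately. The weights and names are hypothetical, chosen so that the arithmetic is easy to follow.

```python
import math

def gelu(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def recent_window(tokens, window):
    """Keep the last `window` interactions (all of them if the sequence is shorter)."""
    if window <= 0:
        raise ValueError("window must be positive")
    return tokens[-window:]

def linear(vec, weight, bias):
    """Compute weight @ vec + bias for one token vector."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]

def channel_mix(tokens, w1, b1, w2, b2):
    """Apply the same two-layer MLP to every token's feature vector."""
    out = []
    for vec in tokens:
        hidden = [gelu(v) for v in linear(vec, w1, b1)]
        out.append(linear(hidden, w2, b2))
    return out

# Hypothetical embedded sequence of three interactions with two features each.
tokens = [[1.0, -1.0], [0.0, 0.0], [2.0, 0.0]]
w1, b1 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]  # identity first layer
w2, b2 = [[1.0, 1.0]], [0.5]                   # sums the hidden units, then shifts

local_tokens = recent_window(tokens, 2)
local_out = channel_mix(local_tokens, w1, b1, w2, b2)
```

With a window of 2, only the last two interactions reach the mixer, so the earliest interaction has no influence on `local_out`.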
6. The method of claim 5, wherein
the [symbol] and the [symbol] are generated by a parameterized function [symbol], a weight matrix being generated independently from each token by another MLP:
[formula]
where [symbol] is a vector carrying additional information through position coding, and [symbol] represents a multi-layer perceptron (MLP).
7. The method of claim 1, wherein the step of fusing the full-length sequence features with the interaction features to obtain the feature mixer result, and predicting the exercise result, the knowledge point result and the answer situation result of the learner at the next moment according to the feature mixer result comprises:
splicing the full-length sequence features and the interaction features to obtain a spliced feature [symbol];
taking the spliced feature [symbol] as the sequence feature of the current moment, and predicting the exercise result [symbol], the knowledge point result [symbol] and the answer situation result [symbol] at the next moment:
[formula]
wherein [symbol] represents a sigmoid function, [symbol] is a fully connected layer parameter that reduces the dimension through a neural network while adjusting the output to the interval [symbol], and [symbol], [symbol] and [symbol] are the exercise label, the knowledge point label and the answer situation label at time t+1, respectively.
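The fusion and prediction step of claim 7 can be sketched as a concatenation followed by a fully connected layer and a sigmoid. The sketch assumes one scalar output per prediction head and shows only one head. The feature values, weights and names are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function with values strictly between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_next(global_feat, local_feat, weight, bias):
    """Splice the two feature vectors, then apply a linear layer and a sigmoid."""
    fused = list(global_feat) + list(local_feat)  # splicing = concatenation
    if len(weight) != len(fused):
        raise ValueError("weight length must match the fused feature length")
    return sigmoid(sum(w * f for w, f in zip(weight, fused)) + bias)

# Hypothetical features at the current moment.
global_feat = [0.2, -0.1, 0.4]
local_feat = [0.3, 0.0]

# Hypothetical weights for one head, for example the answer situation result.
p_correct = predict_next(global_feat, local_feat, [0.5] * 5, 0.0)
```

The exercise and knowledge point heads could reuse `predict_next` with their own weights. Whether the patent shares or separates these parameters is not stated in this text.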
8. The method of claim 1, wherein after the step of fusing the full-length sequence features with the interaction features to obtain the feature mixer result, and predicting the exercise result, the knowledge point result and the answer situation result of the learner at the next moment according to the feature mixer result, the method further comprises:
calculating a score corresponding to each interaction sequence of the learner at each time step based on a binary cross-entropy loss function, wherein the loss of a single sequence is expressed as:
[formula]
where [symbol] is the real label of the student at the time of interaction.
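The binary cross-entropy loss named in claim 8 can be sketched in its standard form. The sketch assumes the per-sequence loss is the sum over time steps of -(y·log p + (1 - y)·log(1 - p)), where p is the predicted probability and y the real label. Whether the patent sums or averages over time steps is not recoverable from this text, and the names below are hypothetical.

```python
import math

def sequence_bce(predictions, labels, eps=1e-12):
    """Sum of binary cross-entropy terms over the time steps of one sequence."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    total = 0.0
    for p, y in zip(predictions, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

# Hypothetical predictions and real labels for three time steps.
loss_uncertain = sequence_bce([0.5, 0.5, 0.5], [1, 0, 1])
loss_confident = sequence_bce([0.9, 0.1, 0.9], [1, 0, 1])
```

Predictions closer to the real labels give a smaller loss, so `loss_confident` is lower than `loss_uncertain`.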
9. A HyperMixer-based knowledge tracking system, the HyperMixer-based knowledge tracking system comprising: a memory, a processor, and a HyperMixer-based knowledge tracking program stored on the memory and executable on the processor, the HyperMixer-based knowledge tracking program, when executed by the processor, implementing the steps of the HyperMixer-based knowledge tracking method as claimed in any one of claims 1 to 8.
10. A computer-readable storage medium, characterized in that it has stored thereon a HyperMixer-based knowledge tracking program which, when executed by a processor, implements the steps of the HyperMixer-based knowledge tracking method according to any one of claims 1 to 8.
CN202410389529.0A 2024-04-02 2024-04-02 HyperMixer-based knowledge tracking method Active CN117992924B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410389529.0A CN117992924B (en) 2024-04-02 2024-04-02 HyperMixer-based knowledge tracking method

Publications (2)

Publication Number Publication Date
CN117992924A true CN117992924A (en) 2024-05-07
CN117992924B CN117992924B (en) 2024-06-07

Family

ID=90902311

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410389529.0A Active CN117992924B (en) 2024-04-02 2024-04-02 HyperMixer-based knowledge tracking method

Country Status (1)

Country Link
CN (1) CN117992924B (en)

Also Published As

Publication number Publication date
CN117992924B (en) 2024-06-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant