CN114595923A - Group teaching recommendation system based on deep reinforcement learning - Google Patents
Group teaching recommendation system based on deep reinforcement learning
- Publication number
- CN114595923A (application CN202210028554.7A)
- Authority
- CN
- China
- Prior art keywords
- student
- model
- recommendation
- group
- teaching
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/067—Enterprise or organisation modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
- G06Q50/205—Education administration or guidance
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B5/00—Electrically-operated educational appliances
- G09B5/08—Electrically-operated educational appliances providing for individual presentation of information to a plurality of student stations
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a group teaching recommendation system based on deep reinforcement learning, belonging to the technical field of education and information. The invention collects student data through in-class interaction methods such as voting, question answering, homework and quizzes, and provides the teaching plan with the maximum overall benefit for a given student group. The overall benefit can be expressed by a multi-objective optimization function, which may include, but is not limited to, the pass rate, the excellence rate and the average score. The invention uses a deep reinforcement learning method to plan goal-oriented teaching paths for teachers and can process large-scale, complex data. Because the training process, which takes most of the time, is carried out before and after class, the teacher can obtain recommended knowledge points immediately from the students' classroom feedback during class.
Description
Technical Field
The invention relates to the technical field of education and information, in particular to a group teaching recommendation system based on deep reinforcement learning.
Background
In conventional classroom teaching, a teacher often arranges learning content according to experience, because the details of how students learn are neither visible nor controllable. The teacher does have several sources of information for assessing student performance, including questions and answers, classroom assessment, and the students' facial expressions, gestures and body movements. However, this information is often coarse: it cannot cover every student or track each person's learning in detail, so teachers are often unable to design teaching paths at a fine granularity. Teaching assistance systems ease this difficulty. Such a system provides various teacher-student interaction methods and records the interaction information, from which teachers can understand the state of their students more accurately and in more depth. A teaching assistance system can also provide recommended teaching plans or learning plans for teachers or students, further reducing the teachers' workload.
Patent application publication No. CN 112700688A discloses an intelligent classroom teaching assistance system. Student learning data are collected through interaction methods such as classroom voting, the students are modeled and tracked on the basis of these data, and a recommended teaching plan is finally given according to the models of all students in the class. However, its recommendation algorithm simulates the learning process of the students under various teaching plans based on the current student models, and selects the teaching plan with the best simulated effect as the recommendation. To obtain a good recommendation, as many of the possible teaching plans as possible must be simulated, which incurs a large computational cost and takes a long time. As the numbers of students and knowledge points grow, the resulting wait may become unacceptable, so that the teacher cannot get timely feedback in the classroom.
Disclosure of Invention
The invention provides a group teaching recommendation system based on deep reinforcement learning, which can be used for improving the processing efficiency of group teaching recommendation.
The technical scheme adopted by the invention is as follows:
a group teaching recommendation system based on deep reinforcement learning, the system comprising: a user terminal, a knowledge point management module, a student data management module, a student model module, a pre-training module and a group teaching recommendation module;
the user terminal is used for teachers or students to log in the system and is an interactive input and output terminal of the user and the system;
the knowledge point management module is used for a teacher user to input knowledge point data and sends the knowledge point data to the student model module, the pre-training module and the group teaching recommendation module;
the student data management module is used for a student user to input student basic data and sends the student basic data to the student model module; it is also used for collecting student classroom feedback in the classroom and sending the classroom feedback to the group teaching recommendation module;
the student model module creates a student model based on currently input knowledge point data and student basic data according to a preset creating strategy and sends the student model to the pre-training module;
the pre-training module takes the student model created by the student model module as the learning subject, takes the data sent by the knowledge point management module and the student data management module as training data, and trains a preset initial group recommendation model to obtain a trained group recommendation model; the initial group recommendation model comprises a first neural network model and a second neural network model, each comprising an input layer, at least one hidden layer and an output layer, wherein the input of the input layer is a student classroom feedback data sequence, the hidden layer is a neural network capable of processing sequence input, the output layer of the first neural network model outputs the recommendation degree of each knowledge point of the current course, and the output layer of the second neural network model outputs an evaluation value of the current classroom teaching, namely an evaluation value of the currently executed teaching behavior;
the group teaching recommendation module calls the group recommendation model trained by the pre-training module and, during course teaching, outputs teaching recommendation information based on the student classroom feedback of each class and sends it to the corresponding teacher user; it stores the student classroom feedback collected by the student data management module; and, according to the configured model update period, it updates and trains the group recommendation model during course teaching based on the student classroom feedback stored in the current period;
the output teaching recommendation information comprises the recommended knowledge point for the next class and the evaluation value of the current student classroom feedback data sequence, wherein the recommended knowledge point for the next class is the knowledge point with the maximum recommendation degree.
Further, the knowledge point data includes: knowledge point ID, name of the course it belongs to, knowledge point introduction, knowledge point content, knowledge point difficulty coefficient, ID of the prerequisite knowledge point, classroom test questions matched with the knowledge point, and data related to the knowledge point.
Further, the student basic data includes: student's school number, name, age, gender, age and student type; the student classroom feedback comprises the following data: the test subject name, the affiliated knowledge point ID, the test subject content, the ID of the participating test student, the test result of the student and the like.
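The knowledge point records and classroom feedback records listed above map naturally onto simple typed structures. The following is a minimal sketch using Python dataclasses. All class, field and function names are illustrative and do not come from the patent, and encoding one lesson's feedback as a 0/1 vector (with a missing answer counted as 0) is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class KnowledgePoint:
    """One knowledge point record (field names are illustrative)."""
    kp_id: int
    course_name: str
    introduction: str
    content: str
    difficulty: float                      # knowledge point difficulty coefficient
    prerequisite_id: Optional[int] = None  # ID of the prerequisite knowledge point, if any
    test_questions: List[str] = field(default_factory=list)


@dataclass
class ClassroomFeedback:
    """One student's result on one classroom test question."""
    question_name: str
    kp_id: int            # knowledge point the question belongs to
    question_content: str
    student_id: str
    correct: bool         # test result of the student


def feedback_vector(records: List[ClassroomFeedback],
                    student_ids: List[str]) -> List[int]:
    """Encode one lesson's feedback as a 0/1 vector in student_ids order.

    Assumption: a student with no record is encoded as 0.
    """
    by_student = {r.student_id: int(r.correct) for r in records}
    return [by_student.get(s, 0) for s in student_ids]
```

A sequence of such vectors, one per lesson, is one plausible form for the student classroom feedback data sequence that the two networks consume.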
Further, the student model module uses the student model to stand in for real students during the training of the group recommendation model in the pre-training module, and the student model is built on an Ebbinghaus memory model, a half-life memory model or a Bayesian knowledge tracing model;
and the student model describes:
the current mastery state of the virtual student for each knowledge point;
the process by which a virtual student transitions from one state to another through learning;
the classroom feedback given after learning.
Further, the training of the initial group recommendation model by the pre-training module comprises:
the student model created by the student model module is used as a virtual student to form a class for training;
setting course requirement information and initializing network parameters of the initial group recommendation model;
taking the virtual students of the whole class as the environment and the first neural network model and the second neural network model of the initial group recommendation model as the agent, training the agent with a proximal policy optimization algorithm, and storing the current network parameters when a preset training end condition is met, to obtain the trained group recommendation model.
The course requirement information includes: the number of lessons, and the pass rate, excellence rate, average score and the like that must be achieved by the end of the course.
Further, training the agent with the proximal policy optimization algorithm includes:
step S1: recording the initial state of the virtual student;
step S2: judging whether the first cycle number reaches a preset first maximum cycle number, if so, executing a step S3; otherwise, the following processing is executed in a loop:
step S201: resetting the virtual student status to the initial status recorded in step S1;
step S202: executing steps S202-1 to S202-4 in a loop until the number of iterations reaches a preset maximum number of sub-iterations; in each iteration, recording the student classroom feedback of the virtual students, the recommendation degrees of the knowledge points output by the first neural network model, the evaluation value output by the second neural network model, and the reward value, which is calculated according to the course requirement information from the classroom feedback of all virtual students on the knowledge point learned last;
step S202-1: all the virtual students participate in classroom learning, and the virtual students give classroom feedback;
step S202-2: forming a student classroom feedback data sequence from the student classroom feedback given in step S202-1, inputting the sequence into the first neural network model, and obtaining the knowledge point for the next lesson from the network output, namely taking the knowledge point with the maximum recommendation degree as the knowledge point for the next lesson;
step S202-3: forming a student classroom feedback data sequence from the student classroom feedback given in step S202-1, inputting the sequence into the second neural network model, and obtaining the evaluation value of the sequence from the network output;
step S202-4: all the virtual students learn the next teaching knowledge point obtained based on the first neural network;
step S3: judging whether a preset second maximum cycle number is reached, if so, ending; otherwise, the following processing is executed in a loop:
step S301: sampling the student classroom feedback data collected in the step S2;
step S302: calculating a first objective function (namely the output loss of the first neural network model) based on the sampled data, and adjusting the network parameters of the first neural network model according to a preset stochastic gradient ascent algorithm;
step S303: calculating a second objective function (namely the output loss of the second neural network model) based on the sampled data, and adjusting the network parameters of the second neural network model according to a preset stochastic gradient ascent algorithm;
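Steps S1 to S3 describe two phases: collecting data from the virtual class, then repeatedly sampling that data to update both networks. The following is a minimal structural sketch of the two phases. Every callable passed in (`reset_class`, `step_class`, `policy`, `value`, `reward_fn`, `update_actor`, `update_critic`) is a hypothetical placeholder, and computing the reward from the feedback that follows each lesson is an assumption.

```python
import random


def collect_rollouts(reset_class, step_class, policy, value, reward_fn,
                     num_episodes, num_lessons):
    """Phase S2: replay the course from the same initial state and log data."""
    buffer = []
    for _ in range(num_episodes):                 # outer loop of step S2
        feedback = reset_class()                  # S201: restore the initial state
        for _ in range(num_lessons):              # S202: one pass per lesson
            action = policy(feedback)             # S202-2: next knowledge point
            estimate = value(feedback)            # S202-3: evaluation value
            next_feedback = step_class(action)    # S202-4, then S202-1 of the next pass
            buffer.append((feedback, action, estimate, reward_fn(next_feedback)))
            feedback = next_feedback
    return buffer


def update(buffer, num_updates, batch_size, update_actor, update_critic,
           rng: random.Random):
    """Phase S3: repeatedly sample the logged data and adjust both networks."""
    for _ in range(num_updates):
        batch = rng.sample(buffer, min(batch_size, len(buffer)))  # S301
        update_actor(batch)                                       # S302
        update_critic(batch)                                      # S303
```

Resetting to the recorded initial state at the start of every episode is what lets the agent compare different teaching paths for the same class.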
the recommendation and training process of the group teaching recommendation module comprises the following steps:
initializing the group recommendation model with the network parameters stored after the pre-training module finishes training;
after the teacher starts teaching, forming a student classroom feedback data sequence from the student classroom feedback collected in each class through the user terminal, and inputting the sequence into both the first neural network model and the second neural network model of the group recommendation model; obtaining the recommended knowledge point for the next class and the corresponding evaluation value from their outputs, and storing the student classroom feedback data sequence, the recommended knowledge point and the evaluation value; and sending the recommended knowledge point for the next class to the corresponding teacher;
and after class, updating and training the group recommendation model based on the historical data stored in the current update period, wherein the historical data comprises a plurality of data records, each comprising a student classroom feedback data sequence, a recommended knowledge point and an evaluation value.
The technical scheme provided by the embodiment of the invention at least has the following beneficial effects:
compared with the prior art, the invention collects student data through in-class interaction methods such as voting, question answering, homework and quizzes, and provides the teaching plan with the maximum overall benefit for a given student group (such as a whole class). The overall benefit can be represented by a multi-objective optimization function, which may include (but is not limited to) the pass rate, the excellence rate and the average score. The invention uses a deep reinforcement learning method to plan goal-oriented teaching paths for teachers and can process large-scale, complex data. Because the training process, which takes most of the time, is carried out before and after class, the teacher can obtain recommended knowledge points immediately from the students' classroom feedback during class.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a block diagram of a group teaching recommendation system based on deep reinforcement learning according to an embodiment of the present invention;
FIG. 2 is a teaching process data sequence chart of a group teaching recommendation system based on deep reinforcement learning according to an embodiment of the present invention;
FIG. 3 is a flow chart of a pre-training module of a group teaching recommendation system based on deep reinforcement learning according to an embodiment of the present invention;
FIG. 4 is a flowchart of a group teaching recommendation module of a group teaching recommendation system based on deep reinforcement learning according to an embodiment of the present invention;
FIG. 5 is a diagram of the clip function of a group teaching recommendation system based on deep reinforcement learning according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
The embodiment of the invention provides a group teaching recommendation system based on deep reinforcement learning. As shown in FIG. 1, the system comprises: a user terminal (used by teachers or students to log in to the system), a knowledge point management module, a student data management module, a student model module, a pre-training module and a group teaching recommendation module. Group teaching recommendation is realized through data interaction among the modules as follows:
(1) a user terminal (teacher user) inputs knowledge point data through the knowledge point management module, and the knowledge point management module sends the knowledge point data to the student model module, the pre-training module and the group teaching recommendation module;
(2) a user terminal (student user) inputs student basic data through a student data management module, and the student data management module sends the student basic data to a student model module and a pre-training module; in the classroom, through the interaction with the user terminal, the classroom feedback of students is collected and sent to the group teaching recommendation module;
(3) the student model module creates a student model based on the currently input related information (knowledge point data and student basic data) according to a preset creating strategy and sends the student model to the pre-training module;
(4) the pre-training module takes the student model created by the student model module as the learning subject, takes the data sent by the knowledge point management module and the student data management module as training data, and trains a preset initial group recommendation model to obtain a trained group recommendation model;
(5) the group teaching recommendation module calls a group recommendation model trained by the pre-training module, and in the course teaching process, teaching recommendation information is output and sent to corresponding teacher users in combination with the classroom feedback of students of each class of the curriculum; and storing student classroom feedback collected by the student data management module; and updating and training the group recommendation model based on the classroom feedback of the students stored in the current period according to the configured model updating period in the course teaching process.
In this embodiment, the knowledge point management module is configured to: receive and store the knowledge point data (namely knowledge point information) input by an expert; and provide the received data as a data set to the other modules. The expert is a teacher, or a group of teachers, with extensive teaching experience and familiarity with the knowledge points of the course. The knowledge point information comprises the knowledge point ID, the name of the course it belongs to, knowledge point introduction, knowledge point content, knowledge point difficulty coefficient, ID of the prerequisite knowledge point, classroom test questions matched with the knowledge point, and data related to the knowledge point.
In this embodiment, the student data management module is configured to: receiving and storing the student basic information input by the students; collecting and storing classroom feedback data of students in a classroom test interaction mode; the received data is provided as a data set to other modules for use. The basic information of the students comprises the study numbers, names, ages, sexes, ages and types of the students; the classroom feedback data comprises test subject names, affiliated knowledge points ID, test subject contents, ID of students participating in testing and test results of the students; the data set comprises a student basic information data set and a classroom feedback data set; the data sequence generated in the teaching process in this embodiment is shown in fig. 2.
In this embodiment, the student model module is configured to create a student model based on the student basic information data set, and to use the student model in place of real students during the training of the group recommendation model in the pre-training module. The student model is implemented with an Ebbinghaus memory model and describes the following:
(1) describing the current grasping state of the virtual student for each knowledge point, the formula in this embodiment is as follows:
where $P_i$ represents the probability that a student has mastered the $i$-th knowledge point; the formula also uses the mastery probability of the prerequisite knowledge point of the $i$-th knowledge point; $\theta$ is a difficulty coefficient determined by the specific circumstances of the student and the knowledge point; $D$ is the time elapsed since the knowledge point was last learned; and $S$ is the total number of times the knowledge point has been learned;
(2) describing the process of how a virtual student transitions from one state to another by learning, in this embodiment by changing D and S in the above formula;
(3) describing the classroom feedback after learning: in this embodiment, a random number between 0 and 1 is sampled; if the random number is smaller than $P_i$ in the above formula, the student is considered to answer the question on the corresponding knowledge point correctly, and otherwise incorrectly.
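The three parts of the student model (mastery state, state transition, feedback) can be sketched as one small class. The mastery formula is not reproduced in this text, so the sketch assumes a common exponential forgetting form, $P_i = P_{\text{pre}} \cdot e^{-\theta D / S}$. That form, the treatment of a never-learned point as $P_i = 0$, the use of one lesson as the unit of $D$, and all names are assumptions for illustration.

```python
import math
import random


class VirtualStudent:
    """Toy virtual student; the mastery formula below is an assumed form."""

    def __init__(self, theta, prerequisites, rng=None):
        self.theta = theta                  # difficulty coefficient per knowledge point
        self.prerequisites = prerequisites  # index -> prerequisite index, or None
        self.elapsed = [0.0] * len(theta)   # D: time since the point was last learned
        self.count = [0] * len(theta)       # S: total number of times learned
        self.rng = rng or random.Random()

    def mastery(self, i):
        """P_i: assumed exponential forgetting, scaled by prerequisite mastery."""
        if self.count[i] == 0:
            return 0.0                      # assumption: never learned means not mastered
        own = math.exp(-self.theta[i] * self.elapsed[i] / self.count[i])
        pre = self.prerequisites[i]         # prerequisite chain is assumed acyclic
        return own if pre is None else own * self.mastery(pre)

    def learn(self, i):
        """State transition: one lesson passes and knowledge point i is studied."""
        self.elapsed = [d + 1.0 for d in self.elapsed]
        self.elapsed[i] = 0.0
        self.count[i] += 1

    def answer(self, i):
        """Classroom feedback: correct if a uniform draw falls below P_i."""
        return self.rng.random() < self.mastery(i)
```

Learning changes only $D$ and $S$, which matches item (2): the state transition is expressed entirely through those two quantities.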
In this embodiment, the pre-training module is configured to train the group recommendation model before class through a proximal policy optimization algorithm based on the student model module; the flow is shown in FIG. 3. The group recommendation model consists of a recommendation neural network and a critic neural network. The recommendation neural network is a recurrent neural network: because the student feedback data is a sequence ordered in time, the recommendation neural network must be able to process sequence input. The structure of the critic neural network is similar to that of the recommendation neural network; both comprise an input layer, one or more hidden layers and an output layer, where the input layer receives the student classroom feedback sequence and the number of hidden layers of the two networks may be the same or different. The output layer is the main difference between the two. The output layer of the recommendation neural network is used for classification output and adopts a softmax function; its output represents the recommendation degree of each knowledge point of the current course for the next class (when a recommendation result is formed, the knowledge point with the maximum recommendation degree is taken). The output layer of the critic neural network adopts a linear function; its output represents the evaluation value of the behavior at each sampling moment, that is, the evaluation of the current classroom teaching.
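The difference between the two output layers can be illustrated on a single hidden vector. The following is a minimal sketch in plain Python, assuming the hidden state has already been produced by the recurrent layers. The weight layout and all function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, hidden):
    """Dense layer: one output per weight row."""
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(weights, bias)]


def recommend(hidden, w_actor, b_actor):
    """Recommendation head: softmax over knowledge points, then argmax."""
    degrees = softmax(linear(w_actor, b_actor, hidden))
    best = max(range(len(degrees)), key=degrees.__getitem__)
    return best, degrees


def evaluate(hidden, w_critic, b_critic):
    """Critic head: a single linear output used as the evaluation value."""
    return linear([w_critic], [b_critic], hidden)[0]
```

Here `recommend` returns both the index of the recommended knowledge point and the full vector of recommendation degrees, since training needs the probabilities while the teacher only needs the top choice.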
The group recommendation model is trained using Proximal Policy Optimization (PPO), and the specific training process is as follows:
(1) the student models created by the student model module are used as virtual students and form a class to participate in training, for example a class of 20;
(2) setting course requirement information;
(3) initializing the group recommendation model, namely initializing the network parameters of the recommendation neural network and the critic neural network;
(4) taking the virtual students of the whole class as the environment and the recommendation neural network and the critic neural network as the agent, and training the agent with the proximal policy optimization algorithm;
(5) after training of the recommendation neural network and the critic neural network is completed, storing the current network parameters of both networks and providing them to the group teaching recommendation module.
As a possible implementation of the training process above: in (2), the course requirement information specifies 80 lessons, a pass rate of 0.8 and an excellence rate of 0.2 to be achieved by the end of the course, and an average score that is as high as possible; in (3), the initialized group recommendation model comprises a neural network with two layers and a hidden layer with 64 neurons; in (4), the proximal policy optimization algorithm proceeds as follows:
(1) recording the initial state of the virtual student;
(2) repeat the following steps a specified number of times:
(2-a) repeat the following steps a specified number of times:
I. resetting the virtual student state to the initial state saved in (1);
II. Repeat the following steps until the number of lessons learned reaches the set number of lessons. In each iteration, record the test results returned by the students, the knowledge point output by the recommendation neural network, the evaluation value output by the critic neural network, and the reward value, which is calculated according to the course requirement information from the classroom feedback of all students on the knowledge point learned last. The reward value formula is as follows:
Reward=λ1Rp+λ2Re+λ3Ra
wherein ,RpFinger-to-pass rate, ReMeans excellent rate, RaMeans the evaluation grasp probability, lambda, of all students to the knowledge point1,λ2 and λ3The weights of the two are represented, and the values are greater than or equal to 0, and are empirical values, which is not limited in the present invention. In this example, 5, 3, 1 are taken.
1) All the virtual students participate in the classroom test, and the virtual students return test results;
2) transmitting the classroom feedback into a recommendation neural network, and outputting a recommended knowledge point of the next teaching;
3) transmitting the classroom feedback into a comment family neural network, and outputting an evaluation value;
4) all virtual students learn and recommend knowledge points output by the neural network;
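The reward used in step II can be sketched as a weighted sum with the embodiment's weights 5, 3 and 1. The helper `rates_from_scores` and its pass and excellent marks of 60 and 90 are assumptions made only for this illustration; the patent does not state how the rates are derived from test results.

```python
def rates_from_scores(scores, pass_mark=60, excellent_mark=90):
    """Hypothetical helper: pass rate and excellent rate of a class from test scores."""
    n = len(scores)
    pass_rate = sum(s >= pass_mark for s in scores) / n
    excellent_rate = sum(s >= excellent_mark for s in scores) / n
    return pass_rate, excellent_rate


def reward(pass_rate, excellent_rate, mastery, weights=(5.0, 3.0, 1.0)):
    """Reward = lambda1 * Rp + lambda2 * Re + lambda3 * Ra."""
    l1, l2, l3 = weights
    return l1 * pass_rate + l2 * excellent_rate + l3 * mastery


# Four hypothetical students: three pass, one of them is excellent.
r_p, r_e = rates_from_scores([55, 70, 95, 80])
value = reward(r_p, r_e, mastery=0.5)  # 5*0.75 + 3*0.25 + 1*0.5
```

With these weights, a change in pass rate moves the reward five times as much as the same change in mastery probability, which reflects the priority the embodiment gives to the pass-rate requirement.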
(2-b) repeat the following steps a specified number of times:
I. sample from the data collected in (2-a);
II. calculate an objective function using the sampled data and train the recommendation neural network with a stochastic gradient ascent algorithm; the formula is as follows:
where $\theta_k$ denotes the parameters of the recommendation neural network at the k-th training; $D_k$ denotes the sampled data set; $\tau$ denotes the sampled data under one teaching path, i.e. one complete teaching trajectory sample (for example, a course has 40 class sessions; after teaching is complete there are 40 lessons, and the teaching of these 40 lessons forms one group of data; $D_k$ is composed of many such groups $\tau$); $T$ is the course duration; and $\pi_\theta(a_t|s_t)$ denotes the probability that the output is $a_t$ when the parameters are $\theta$ and the classroom feedback input at time $t$ is $s_t$. The clip function is illustrated in FIG. 4. The input parameters of clip() include $r_t(\theta)$ and the boundary value $\epsilon$: if $r_t(\theta) \le 1-\epsilon$, then clip() $= 1-\epsilon$; if $r_t(\theta) \ge 1+\epsilon$, then clip() $= 1+\epsilon$; and if $1-\epsilon < r_t(\theta) < 1+\epsilon$, then clip() $= r_t(\theta)$. In this embodiment, the boundary value $\epsilon$ is 0.1. The advantage value of the behavior at time $t$ is calculated as follows:
$$\xi_t = r_t + \gamma V(s_{t+1}) - V(s_t)$$
where $\xi_t$ denotes the intermediate parameter at time $t$ (the intermediate parameters at different times are all calculated with the formula for $\xi_t$ above), $\gamma$ denotes the discount factor, which is 0.99 in this embodiment, $T$ denotes the total time up to now, $r_t$ denotes the reward value obtained at time $t$, and $V(s_t)$ denotes the evaluation value given by the critic neural network at time $t$;
III. calculate an objective function using the sampled data and train the critic neural network with a stochastic gradient ascent algorithm; the formula is as follows:
where the parameter symbol refers to the parameters of the critic neural network at the k-th training, and the value term represents the output (evaluation value) of the critic neural network at time $t$ under the current network parameters.
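For reference, the symbols defined for step II are consistent with the standard PPO-clip objective. The block below is a hedged reconstruction under that assumption, not the patent's own figure. The probability ratio $r_t(\theta)$, the advantage symbol $\hat{A}_t$, and in particular the discounted-sum form of $\hat{A}_t$ are assumptions. Note that $r_t(\theta)$ (the ratio) and $r_t$ (the reward in the formula for $\xi_t$) are different quantities.

```latex
% Assumed standard PPO-clip form; the original formula may differ.
\theta_{k+1} = \arg\max_{\theta} \frac{1}{|D_k|\,T}
  \sum_{\tau \in D_k} \sum_{t=0}^{T}
  \min\!\Big( r_t(\theta)\,\hat{A}_t,\;
              \operatorname{clip}\big(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_t \Big),
\qquad
r_t(\theta) = \frac{\pi_{\theta}(a_t \mid s_t)}{\pi_{\theta_k}(a_t \mid s_t)}

% One common way to build the advantage from the intermediate parameters (assumption):
\hat{A}_t = \sum_{l=0}^{T-t} \gamma^{l}\, \xi_{t+l},
\qquad
\xi_t = r_t + \gamma V(s_{t+1}) - V(s_t)
```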
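The pieces of step (2-b) that are fully specified in the text, namely the clip function with boundary value 0.1 and the intermediate parameter ξ_t, can be sketched directly. The `surrogate` function assumes the standard PPO-clip term (the minimum of the unclipped and clipped products), which the text does not spell out; all function names are hypothetical.

```python
def clip(ratio, eps=0.1):
    """Clamp the probability ratio to the interval [1 - eps, 1 + eps]."""
    return max(1.0 - eps, min(ratio, 1.0 + eps))


def td_residuals(rewards, values, gamma=0.99):
    """xi_t = r_t + gamma * V(s_{t+1}) - V(s_t).

    `values` holds one more entry than `rewards`: the critic's evaluation
    of the state reached after the last recorded lesson.
    """
    return [rewards[t] + gamma * values[t + 1] - values[t]
            for t in range(len(rewards))]


def surrogate(ratio, advantage, eps=0.1):
    """Assumed PPO-clip term: min(ratio * A, clip(ratio) * A)."""
    return min(ratio * advantage, clip(ratio, eps) * advantage)


# Hypothetical two-lesson trajectory, with gamma = 0.5 for easy arithmetic.
xi = td_residuals([1.0, 2.0], [0.5, 1.0, 0.0], gamma=0.5)
gain = surrogate(1.5, 2.0)   # positive advantage: the ratio is capped at 1.1
loss = surrogate(1.5, -2.0)  # negative advantage: the unclipped term is kept
```

Taking the minimum means the clip only ever removes incentive to move the policy further; it never hides a case where the update would make things worse.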
In this embodiment, the group teaching recommendation module receives and stores the students' classroom feedback data in class, gives recommended knowledge points for teaching based on the classroom feedback, and further trains the group recommendation model after class using the classroom feedback data. The recommendation and training process is shown in fig. 5:
(1) the group recommendation model is initialized with the parameters saved after the pre-training module finishes training;
(2) the teacher starts teaching; the students give classroom feedback, which is input into the recommendation neural network and the critic neural network; the recommendation neural network outputs the recommended teaching knowledge point, the critic neural network outputs an evaluation value, and all of these data are stored;
(3) step (2) is executed cyclically, with the teacher teaching according to the recommended knowledge points. After a certain number of cycles, the objective function is calculated after class using the data stored so far, and the group recommendation model is trained again;
(4) step (3) is repeated until the course ends.
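The recommend, store and retrain cycle above can be sketched as a small wrapper. This is an illustration only: `OnlineRecommender`, its parameters and the stub callables are hypothetical, and clearing the stored records after each retraining is an assumption (the description speaks of the data stored so far).

```python
class OnlineRecommender:
    """Stores per-class records and retrains after every `update_period` classes."""

    def __init__(self, policy, critic, retrain, update_period):
        self.policy = policy          # feedback -> recommended knowledge point
        self.critic = critic          # feedback -> evaluation value
        self.retrain = retrain        # called with the stored records
        self.update_period = update_period
        self.records = []

    def after_class(self, feedback):
        point = self.policy(feedback)
        value = self.critic(feedback)
        self.records.append((feedback, point, value))
        if len(self.records) >= self.update_period:
            self.retrain(self.records)
            self.records = []         # assumed: start a fresh update period
        return point, value


# Stub networks and a stub trainer that only notes how many records it received.
retrain_sizes = []
recommender = OnlineRecommender(
    policy=lambda feedback: 0,
    critic=lambda feedback: 0.5,
    retrain=lambda records: retrain_sizes.append(len(records)),
    update_period=2,
)
for feedback in ([1, 0], [1, 1], [0, 0]):
    last = recommender.after_class(feedback)
```

After three classes with an update period of two, the trainer has been called once with two records and one record is waiting for the next period.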
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
What has been described above are merely some embodiments of the present invention. It will be apparent to those skilled in the art that various changes and modifications can be made without departing from the inventive concept, and all such changes and modifications fall within the scope of protection of the invention.
Claims (10)
1. A group teaching recommendation system based on deep reinforcement learning is characterized by comprising: the system comprises a user terminal, a knowledge point management module, a student data management module, a student model module, a pre-training module and a group teaching recommendation module;
the user terminal is used for teachers or students to log in the system and is an interactive input and output terminal of the user and the system;
the knowledge point management module is used for a teacher user to input knowledge point data and to send the knowledge point data to the student model module, the pre-training module and the group teaching recommendation module;
the student data management module is used for a student user to input student basic data and to send the student basic data to the student model module; it is also used for collecting student classroom feedback in class and sending the classroom feedback to the group teaching recommendation module;
the student model module creates a student model based on currently input knowledge point data and student basic data according to a preset creating strategy and sends the student model to the pre-training module;
the pre-training module takes a student model created by the student model module as a learning main body, takes data sent by the knowledge point management module and the student data management module as training data, and trains a preset initial group recommendation model to obtain a trained group recommendation model; the initial group recommendation model comprises a first neural network model and a second neural network model, the first neural network model and the second neural network model comprise an input layer, at least one hidden layer and an output layer, wherein the input layer is a student classroom feedback data sequence, the hidden layer is a neural network capable of processing sequence input, and the output layer of the first neural network model is used for outputting recommendation degrees of all knowledge points of a current course; the output layer of the second neural network model is used for outputting the evaluation value of the current classroom teaching;
the group teaching recommendation module calls a group recommendation model trained by the pre-training module, and outputs teaching recommendation information in combination with student classroom feedback of each class of the curriculum in the course teaching process and sends the teaching recommendation information to corresponding teacher users; and storing student classroom feedback collected by the student data management module; updating and training the group recommendation model based on the classroom feedback of students stored in the current period according to the configured model updating period in the course teaching process;
the output teaching recommendation information comprises a recommendation knowledge point of the next class and an evaluation value of a feedback data sequence of the current student class, wherein the recommendation knowledge point of the next class is the knowledge point with the maximum recommendation degree.
2. The group teaching recommendation system of claim 1, wherein the knowledge point data includes: knowledge point ID, name of the course it belongs to, knowledge point introduction, knowledge point content, knowledge point difficulty coefficient, IDs of the prerequisite knowledge points of the knowledge point, classroom test questions matched with the knowledge point, and data related to the knowledge point.
3. The group teaching recommendation system of claim 1, wherein the student basic data comprises: student number, name, age, gender, age and student type; the student classroom feedback comprises the following data: the test question name, the ID of the knowledge point it belongs to, the test question content, the IDs of the students taking the test, the students' test results, and the like.
4. The group teaching recommendation system of claim 1, wherein the student model module uses student models to simulate real students participating in the group recommendation model training process of the pre-training module, the student model being built on an Ebbinghaus memory model, a half-life memory model or a Bayesian knowledge tracing model;
and the student model describes:
the current mastery state of the virtual student for each knowledge point;
the process by which a virtual student transitions from one state to another through learning;
the classroom feedback after learning.
5. The group teaching recommendation system of claim 1, wherein the training of the initial group recommendation model by the pre-training module comprises:
the student models created by the student model module serve as virtual students and form a class for training;
course requirement information is set and the network parameters of the initial group recommendation model are initialized;
with the virtual students of the whole class as the environment and the first neural network model and the second neural network model of the initial group recommendation model as the agent, the agent is trained using a proximal policy optimization algorithm, and the current network parameters are saved when a preset training end condition is met, to obtain the trained group recommendation model.
6. The group teaching recommendation system of claim 5, wherein the course requirement information includes: the number of class sessions, and the pass rate, excellent rate and average score to be achieved at the end of the course.
7. The group teaching recommendation system of claim 1, wherein training the agent using the proximal policy optimization algorithm comprises:
step S1: recording the initial state of the virtual student;
step S2: judging whether the first cycle number reaches a preset first maximum cycle number, if so, executing step S3; otherwise, the following processing is executed in a loop:
step S201: resetting the virtual student status to the initial status recorded in step S1;
step S202: circularly executing the step S202-1 to the step S202-4 until the circulation times reach the preset maximum sub-circulation times; the student classroom feedback of the virtual students in each cycle, the recommendation degree of the knowledge points output by the first neural network model, the evaluation value output by the second neural network model and the reward value obtained by calculating the knowledge points learned last time according to the course requirement information through the classroom feedback of all the virtual students are recorded;
step S202-1: all the virtual students participate in classroom learning, and the virtual students give classroom feedback;
step S202-2: forming a student classroom feedback data sequence by the student classroom feedback given in the step S202-1, inputting the student classroom feedback data sequence into a first neural network model, and obtaining a knowledge point of the next teaching based on the output of the student classroom feedback data sequence, namely taking the knowledge point with the maximum recommendation degree as the knowledge point of the next teaching;
step S202-3: forming a student classroom feedback data sequence by the student classroom feedback given in the step S202-1, inputting the student classroom feedback data sequence into a second neural network model, and obtaining an evaluation value of the student classroom feedback data sequence based on the output of the student classroom feedback data sequence;
step S202-4: all the virtual students learn the next teaching knowledge point obtained based on the first neural network model;
step S3: judging whether a preset second maximum cycle number is reached, if so, ending; otherwise, the following processing is executed in a loop:
step S301: sampling the student classroom feedback data collected in the step S2;
step S302: calculating a first objective function based on the sampled data, and adjusting the network parameters of the first neural network model according to a preset stochastic gradient ascent algorithm, wherein the first objective function is used for representing the output loss of the first neural network model;
step S303: calculating a second objective function based on the sampled data, and adjusting the network parameters of the second neural network model according to a preset stochastic gradient ascent algorithm, wherein the second objective function is used for representing the output loss of the second neural network model.
8. The group teaching recommendation system of claim 7, wherein the first objective function is:
wherein,
$\theta_{k+1}$ denotes the network parameters of the first neural network model at the (k+1)-th training;
$D_k$ denotes the sampled data set;
$T$ denotes the course duration;
$\tau$ denotes the sampled data under one teaching path;
$\pi_\theta(a_t|s_t)$ denotes the probability that the output is $a_t$ when the network parameters are $\theta$ and the student classroom feedback input at time $t$ is $s_t$;
the input parameters of the function clip() include $r_t(\theta)$ and the boundary value $\epsilon$: if $r_t(\theta) \le 1-\epsilon$, then clip() $= 1-\epsilon$; if $r_t(\theta) \ge 1+\epsilon$, then clip() $= 1+\epsilon$; and if $1-\epsilon < r_t(\theta) < 1+\epsilon$, then clip() $= r_t(\theta)$;
the advantage value of the behavior at time $t$ is calculated as follows:
$$\xi_t = r_t + \gamma V(s_{t+1}) - V(s_t);$$
wherein $\xi_t$ denotes the intermediate parameter at time $t$, $\gamma$ denotes a preset discount factor, $r_t$ denotes the reward value obtained at time $t$, and $V(s_t)$ denotes the evaluation value output by the second neural network model at time $t$;
the second objective function is:
wherein ,
9. The group teaching recommendation system of claim 1, wherein the recommendation and training process of the group teaching recommendation module is:
initializing a group recommendation model, and initializing by using the network parameters stored after the training of a pre-training module is finished;
after the teacher starts teaching, forming a student classroom feedback data sequence based on student classroom feedback of students in each class through a user terminal, and respectively inputting a first neural network model and a second neural network model of a group recommendation model; obtaining recommended teaching knowledge points of the next classroom and corresponding evaluation values based on the output of the recommended teaching knowledge points, and storing the student classroom feedback data sequence, the recommended teaching knowledge points and the evaluation values; and sending the recommended teaching knowledge points of the next classroom to the corresponding teachers;
and after the class, updating and training the group recommendation model based on historical data stored in the current updating period, wherein the historical data comprises a plurality of groups of data records, and each group of data comprises a student classroom feedback data sequence, a recommended teaching knowledge point and an evaluation value.
10. The group teaching recommendation system of claim 1, wherein the hidden layers of the first neural network model and the second neural network model are long short-term memory recurrent neural networks.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210028554.7A CN114595923B (en) | 2022-01-11 | 2022-01-11 | Group teaching recommendation system based on deep reinforcement learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210028554.7A CN114595923B (en) | 2022-01-11 | 2022-01-11 | Group teaching recommendation system based on deep reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114595923A (en) | 2022-06-07 |
CN114595923B (en) | 2023-04-28 |
Family
ID=81803873
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210028554.7A Active CN114595923B (en) | 2022-01-11 | 2022-01-11 | Group teaching recommendation system based on deep reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114595923B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116521936A (en) * | 2023-06-30 | 2023-08-01 | 云南师范大学 | Course recommendation method and device based on user behavior analysis and storage medium |
CN117114937A (en) * | 2023-09-07 | 2023-11-24 | 深圳市真实智元科技有限公司 | Method and device for generating exercise song based on artificial intelligence |
CN117455389A (en) * | 2023-10-10 | 2024-01-26 | 北京华普亿方科技集团股份有限公司 | Vocational training management platform based on artificial intelligence |
CN117688248A (en) * | 2024-02-01 | 2024-03-12 | 安徽教育网络出版有限公司 | Online course recommendation method and system based on convolutional neural network |
CN117910481A (en) * | 2024-03-20 | 2024-04-19 | 北京语言大学 | Spoken language dialogue method and device for assisting language learning and dialogue robot |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108614865A (en) * | 2018-04-08 | 2018-10-02 | 暨南大学 | Method is recommended in individualized learning based on deeply study |
CN108615423A (en) * | 2018-06-21 | 2018-10-02 | 中山大学新华学院 | Instructional management system (IMS) on a kind of line based on deep learning |
CN109242207A (en) * | 2018-10-10 | 2019-01-18 | 中山大学 | A kind of Financial Time Series prediction technique based on deeply study |
EP3543918A1 (en) * | 2018-03-20 | 2019-09-25 | Flink AI GmbH | Reinforcement learning method |
US20210027178A1 (en) * | 2019-07-26 | 2021-01-28 | Ricoh Company, Ltd. | Recommendation method and recommendation apparatus based on deep reinforcement learning, and non-transitory computer-readable recording medium |
CN112700688A (en) * | 2020-12-25 | 2021-04-23 | 电子科技大学 | Intelligent classroom teaching auxiliary system |
CN112784154A (en) * | 2020-12-31 | 2021-05-11 | 电子科技大学 | Online teaching recommendation system with data enhancement |
KR20210124111A (en) * | 2021-03-25 | 2021-10-14 | 베이징 바이두 넷컴 사이언스 테크놀로지 컴퍼니 리미티드 | Method and apparatus for training model, device, medium and program product |
CN113509726A (en) * | 2021-04-16 | 2021-10-19 | 超参数科技(深圳)有限公司 | Interactive model training method and device, computer equipment and storage medium |
CN113590929A (en) * | 2021-01-28 | 2021-11-02 | 腾讯科技(深圳)有限公司 | Information recommendation method and device based on artificial intelligence and electronic equipment |
Non-Patent Citations (4)
Title |
---|
JULIAN IBARZ et al.: "How to Train Your Robot with Deep Reinforcement Learning – Lessons We’ve Learned" * |
ZHENYA HUANG et al.: "Exploring Multi-Objective Exercise Recommendations in Online Education Systems" * |
杨腾杰: "一种基于知识点的智能教学辅助系统的设计和实现" * |
王雯: "基于深度强化学习的引导式习题推荐模型研究" * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116521936A (en) * | 2023-06-30 | 2023-08-01 | 云南师范大学 | Course recommendation method and device based on user behavior analysis and storage medium |
CN116521936B (en) * | 2023-06-30 | 2023-09-01 | 云南师范大学 | Course recommendation method and device based on user behavior analysis and storage medium |
CN117114937A (en) * | 2023-09-07 | 2023-11-24 | 深圳市真实智元科技有限公司 | Method and device for generating exercise song based on artificial intelligence |
CN117455389A (en) * | 2023-10-10 | 2024-01-26 | 北京华普亿方科技集团股份有限公司 | Vocational training management platform based on artificial intelligence |
CN117455389B (en) * | 2023-10-10 | 2024-05-28 | 北京华普亿方科技集团股份有限公司 | Vocational training management platform based on artificial intelligence |
CN117688248A (en) * | 2024-02-01 | 2024-03-12 | 安徽教育网络出版有限公司 | Online course recommendation method and system based on convolutional neural network |
CN117688248B (en) * | 2024-02-01 | 2024-04-26 | 安徽教育网络出版有限公司 | Online course recommendation method and system based on convolutional neural network |
CN117910481A (en) * | 2024-03-20 | 2024-04-19 | 北京语言大学 | Spoken language dialogue method and device for assisting language learning and dialogue robot |
Also Published As
Publication number | Publication date |
---|---|
CN114595923B (en) | 2023-04-28 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||