CN115984956A - Human-machine collaborative multi-modal visual analysis system for student classroom engagement - Google Patents

Human-machine collaborative multi-modal visual analysis system for student classroom engagement

Info

Publication number
CN115984956A
CN115984956A (application CN202211621966.8A)
Authority
CN
China
Prior art keywords
classroom
module
learning
analysis module
activity
Prior art date
Legal status
Granted
Application number
CN202211621966.8A
Other languages
Chinese (zh)
Other versions
CN115984956B (en)
Inventor
蒋艳双
祁彬斌
包昊罡
黄荣怀
刘德建
Current Assignee
Beijing Normal University
Original Assignee
Beijing Normal University
Priority date
Filing date
Publication date
Application filed by Beijing Normal University
Priority to CN202211621966.8A
Publication of CN115984956A
Application granted
Publication of CN115984956B
Legal status: Active

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention discloses a human-machine collaborative multi-modal visual analysis system for student classroom engagement. The system comprises a multi-modal data acquisition module, a learning behavior analysis module, a teaching activity fusion analysis module, a classroom engagement analysis module and a visual feedback module connected in sequence; the learning behavior analysis module, the teaching activity fusion analysis module and the classroom engagement analysis module each also connect to an educational domain knowledge extraction module. With this structure, the system collects and analyzes multi-modal student data and, according to established engagement-related classification standards and indices, comprehensively evaluates the learning engagement of students in different scenes and feeds the results back visually.

Description

Human-machine collaborative multi-modal visual analysis system for student classroom engagement
Technical Field
The invention relates to the technical field of intelligent teaching, and in particular to a human-machine collaborative multi-modal visual analysis system for student classroom engagement.
Background
Analysis of student classroom engagement is an important foundation of educational measurement and learning analytics. Existing analysis techniques fall into two categories: observational rating based on structured scales, and coding analysis based on objective behavioral metrics. Observational rating uses a set of engagement rating items as its measurement instrument, but because observers differ in attention and experience, the rating standard is unstable and highly subjective; the approach lacks an automated pipeline for large-scale deployment and suffers from low measurement reliability, high analysis cost and a single mode of result presentation. Coding analysis uses a manual behavior coding scheme, or an automatic one based on computer vision, to decompose a classroom scene into a discrete sequence of student behaviors, and typically takes the proportion of specific first-order or higher-order (transition) behaviors in the sequence as an objective engagement index. It is objective enough to support large-scale deployment, but it is confined to isolated explicit behaviors, excludes human expertise from the loop, and suffers from low measurement efficiency, insufficient interpretability and a single source data modality.
Disclosure of Invention
The invention aims to provide a human-machine collaborative multi-modal visual analysis system for student classroom engagement, which collects and analyzes multi-modal student data and, according to established engagement-related classification standards and indices, comprehensively evaluates and visually feeds back the learning engagement of students in different scenes.
To achieve this aim, the invention provides a human-machine collaborative multi-modal visual analysis system for student classroom engagement, comprising a multi-modal data acquisition module, a learning behavior analysis module, a teaching activity fusion analysis module, a classroom engagement analysis module and a visual feedback module connected in sequence, wherein the learning behavior analysis module, the teaching activity fusion analysis module and the classroom engagement analysis module each also connect to an educational domain knowledge extraction module;
the system comprises a multi-mode data acquisition module, a multi-mode data processing module and a multi-mode data processing module, wherein the multi-mode data acquisition module is used for acquiring original multi-mode data generated in the classroom process, and the original multi-mode data comprises classroom two-dimensional video data, classroom depth video data and classroom audio data;
the learning behavior analysis module calculates and preliminarily analyzes the learning behaviors of the students in real time based on the multi-mode data source, and specifically embodies that the modal information of expressions, actions and languages of the students in a classroom is identified through an artificial intelligence algorithm;
the teaching activity fusion analysis module is used for generating higher-level activity information based on the behavior information analysis of students, and is specifically embodied in that each modal information is cooperatively expressed as matched learning activity by a multi-modal machine learning method;
the classroom investment analysis module is used for analyzing and calculating the investment of individual students in a specific scene in a combined objective learning activity information manner, wherein the specific scene is a background category of occurrence of teaching activities and comprises teaching, practice and discussion, the specific index dimension of the investment analysis and calculation comes from the education field knowledge extraction module, the original value of the specific index dimension is calculated by multiplying a scene row matrix of m columns, a weight matrix of m rows and n columns and an activity column matrix of n rows, the weight matrix comes from the education field knowledge extraction module, and the standard value of the index is calculated by a zero-mean standardization method on the basis of the original value;
the education field knowledge extraction module is used for inquiring and combining expert opinions to form theoretical dimensions and indexes of each level related to classroom input, and the theoretical dimensions and indexes of each level comprise: the learning method comprises the following steps of (1) learning behavior classification standards related to classroom input, learning activity classification standards related to classroom input, teaching scene classification standards related to classroom input, classroom input measurement dimension and index, and weight matrixes of each learning activity corresponding to each measurement index in each teaching scene;
and the visual feedback module is used for visually outputting behaviors and activity recognition results related to classroom investment and evaluation index calculation results, calculating the learning investment index scores of students in various scenes in the classroom process, outputting the evaluation index calculation results as an investment degree change curve, and outputting the visual output mode in a video and image output mode.
Preferably, the multi-modal data acquisition module consists of two 4K cameras and one depth camera: the two 4K cameras are mounted at the upper-left and upper-right corners of the classroom blackboard and film the students in the left and right halves of the classroom respectively, while the depth camera is mounted at the center of the upper edge of the blackboard and films all students in the classroom head-on.
Preferably, in the learning behavior analysis module, the method for identifying the modal information of students in the classroom through artificial intelligence algorithms comprises the following steps:
1) adjusting the confidence threshold of the artificial intelligence algorithm and identifying all visible teacher and student entities with computer vision, detecting the position, category and confidence of each entity in the two-dimensional picture;
2) combining the two-dimensional entity positions with the entity depth information and performing entity-to-label mapping with a dynamic tracking algorithm whose optimization objective is to minimize the inter-frame entity position offset, i.e. the sum over all entities of the Euclidean distances between their positions in adjacent frames in three-dimensional space;
3) extracting and aligning language information through a speech recognition algorithm and a Chinese word vector algorithm, converting unstructured utterances into 300-dimensional structured vectors with a Chinese word vector model pre-trained on a public corpus;
4) identifying the expression and action states of teachers and students in each frame with expression and action recognition models trained on public large-scale datasets, following the action classification coding standard approved in the educational domain knowledge extraction module.
Preferably, in the teaching activity fusion analysis module, the multi-modal machine learning method comprises the following steps:
1) mapping the expression, action and language modal information of each student entity into a common feature space x;
2) training a classification model from the expression, action and language modal information to learning activities, based on the learning activity classification coding standard approved in the educational domain knowledge extraction module, and performing automatic activity matching and coding for each student entity.
Preferably, in the educational domain knowledge extraction module, the theoretical dimensions and indices related to classroom engagement are formed at each level through the following steps:
1) interfacing with the learning behavior analysis module to formulate the learning behavior classification coding standard: from the actions and expressions that current computer vision can recognize, screening out the behaviors and expressions highly related to classroom teaching, covering head poses, body movements, facial expressions, speech and interpersonal interaction;
2) interfacing with the teaching activity fusion analysis module to formulate the teaching activity classification coding standard: coding and annotating 13 activity states, namely listening, hands-on experiment/practice, note taking, doing exercises, operating a computer/PAD, raising a hand, standing up, reading, talking with the teacher, giving feedback to the teacher, discussing with peers, hands-on collaboration and being off-task, and constructing an automatic teaching activity analysis coding table;
3) interfacing with the classroom engagement analysis module to formulate the teaching scene classification coding standard, specifying the scene code and scene category corresponding to each scene description;
4) interfacing with the classroom engagement analysis module to approve the evaluation index dimensions related to classroom engagement;
5) interfacing with the classroom engagement analysis module to determine, by combining subjective and objective judgment, the weight matrices from scenes and activities to each engagement evaluation index.
Accordingly, the human-machine collaborative multi-modal visual analysis system for student classroom engagement with this structure achieves interpretability by decomposing the model computation process and introducing domain knowledge, so that the analysis results of generic scenes translate directly into educational explanations of teacher and student behaviors and actions. By analyzing learning behavior and learning engagement with non-intrusive multi-modal data acquisition and analysis, it overcomes the insufficiency of single-modality information and its susceptibility to external factors, which is of significant value for improving the accuracy of collaborative learning engagement analysis. And by introducing scenes into the analysis process, it explores how learning engagement varies across scenes and characterizes learning variables such as student behavior, language and engagement degree in different scenes.
The technical solution of the present invention is further described in detail below with reference to the accompanying drawings and embodiments.
Drawings
FIG. 1 is a schematic structural diagram of the human-machine collaborative multi-modal visual analysis system for student classroom engagement according to the present invention;
FIG. 2 is a schematic diagram of the physical layout of the multi-modal data acquisition module according to the embodiment of the present invention;
FIG. 3 is a schematic diagram of the model training process of the XGBoost algorithm in the teaching activity fusion analysis module according to the embodiment of the present invention;
FIG. 4 is a classroom live analysis video output by the visual feedback module according to the embodiment of the present invention;
FIG. 5 is an engagement variation curve output by the visual feedback module according to the embodiment of the present invention.
Reference numerals
1. blackboard; 2. 4K camera; 3. depth camera; M1. multi-modal data acquisition module; M2. learning behavior analysis module; M3. teaching activity fusion analysis module; M4. classroom engagement analysis module; M5. educational domain knowledge extraction module; M6. visual feedback module.
Detailed Description
The technical solution of the invention is further explained below with reference to the accompanying drawings and the embodiment.
Embodiment
As shown in fig. 1, a human-machine collaborative multi-modal visual analysis system for student classroom engagement comprises a multi-modal data acquisition module M1, a learning behavior analysis module M2, a teaching activity fusion analysis module M3, a classroom engagement analysis module M4 and a visual feedback module M6 connected in sequence, wherein the learning behavior analysis module M2, the teaching activity fusion analysis module M3 and the classroom engagement analysis module M4 each also connect to an educational domain knowledge extraction module M5;
the multi-mode data acquisition module M1 is used for acquiring original multi-mode data generated in a classroom process, wherein the original multi-mode data comprises classroom two-dimensional video data, classroom depth video data and classroom audio data; the multimode data acquisition module M1 is composed of 2 4K cameras 2 and 1 depth camera 3, as shown in figure 2, two 4K cameras 2 are respectively arranged at the upper left corner and the upper right corner of a blackboard 1 of a classroom, the depth camera 3 is arranged at the center of the upper edge of the blackboard 1, the two 4K cameras 2 respectively shoot students at the left half side and the right half side in the classroom, and the depth camera 3 at the center shoots all students in the classroom forwards.
The learning behavior analysis module M2 computes and preliminarily analyzes student learning behaviors in real time from the multi-modal data sources; concretely, it identifies the expression, action and language modalities of students in the classroom through artificial intelligence algorithms, implemented in the following steps:
1) Adjust the confidence threshold of the artificial intelligence algorithm and identify all visible teacher and student entities with computer vision. Entity detection uses the YOLOv5 algorithm, an open-source object detection network released under the GPL-3.0 license, which detects the position, category and confidence of each entity in the two-dimensional picture;
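By way of illustration, the following is a minimal sketch of how this detection step could be wired up with the open-source YOLOv5 release; the model variant (yolov5s), the threshold value of 0.4 and the image filename are assumptions for the example, not values specified by the patent.

```python
import torch

# Load an open-source YOLOv5 model from the Ultralytics hub
# (the patent names YOLOv5 but not a variant; yolov5s is assumed here).
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
model.conf = 0.4  # adjusted confidence threshold (illustrative value)

# Run detection on one 2D frame captured by a 4K camera.
results = model('classroom_frame.jpg')
detections = results.pandas().xyxy[0]  # xmin, ymin, xmax, ymax, confidence, class, name

# Keep 'person' detections as candidate teacher/student entities.
persons = detections[detections['name'] == 'person']
print(persons[['xmin', 'ymin', 'xmax', 'ymax', 'confidence']])
```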
2) Combine the two-dimensional entity positions with the entity depth information and perform entity-to-label mapping with a dynamic tracking algorithm whose optimization objective is to minimize the inter-frame entity position offset, i.e. the sum over all entities of the Euclidean distances between their positions in adjacent frames in three-dimensional space (x, y, z). The tracking algorithm can set a maximum offset threshold L: when the inter-frame offset of a single entity exceeds L, that entity is judged anomalous and skipped for that frame;
$\min \sum_{i} \left\| p_i^{(t+1)} - p_i^{(t)} \right\|_2, \quad p_i = (x_i, y_i, z_i)$
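The entity-to-label mapping can be read as a linear assignment problem over inter-frame 3D distances. The snippet below is a schematic sketch, not the patent's implementation: it uses scipy's assignment solver to minimize the summed offsets exactly, and the threshold value of 0.5 is an assumed illustration of L.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def track_entities(prev_pos, curr_pos, max_offset):
    """Map entity labels between adjacent frames by minimizing the total
    3D Euclidean offset. prev_pos, curr_pos: (N, 3) arrays of (x, y, z)
    centroids built from 2D detections plus depth. Pairs whose offset
    exceeds max_offset are treated as anomalies and skipped."""
    dist = np.linalg.norm(prev_pos[:, None, :] - curr_pos[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(dist)  # minimizes the summed offsets
    return [(r, c) for r, c in zip(rows, cols) if dist[r, c] <= max_offset]

# Illustrative usage with three entities and an assumed threshold L = 0.5.
prev_frame = np.array([[1.0, 2.0, 4.0], [3.0, 2.0, 5.0], [5.0, 1.0, 6.0]])
curr_frame = np.array([[3.1, 2.0, 5.0], [1.0, 2.1, 4.0], [9.0, 9.0, 9.0]])
print(track_entities(prev_frame, curr_frame, max_offset=0.5))
# -> [(0, 1), (1, 0)]; the third entity moved too far and is skipped.
```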
3) Extract and align language information through a speech recognition algorithm and a Chinese word vector algorithm, converting unstructured utterances into 300-dimensional structured vectors with a Chinese word vector model pre-trained on a public corpus;
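A sketch of this sentence-to-vector step follows; the specific model file, the jieba segmenter and the mean-pooling strategy are assumptions, since the patent only fixes a 300-dimensional word-vector model pre-trained on a public corpus.

```python
import numpy as np
import jieba
from gensim.models import KeyedVectors

# Assumed: a 300-d Chinese word-vector model in word2vec text format,
# e.g. one of the public "Chinese-Word-Vectors" releases (filename assumed).
wv = KeyedVectors.load_word2vec_format('sgns.wiki.word', binary=False)

def utterance_to_vector(text: str) -> np.ndarray:
    """Convert one recognized utterance into a 300-d structured vector by
    mean-pooling the vectors of its in-vocabulary words (pooling strategy
    is an assumption, not specified in the patent)."""
    words = [w for w in jieba.cut(text) if w in wv.key_to_index]
    if not words:
        return np.zeros(wv.vector_size)
    return np.mean([wv[w] for w in words], axis=0)

vec = utterance_to_vector('这道题我们再讨论一下')
print(vec.shape)  # (300,)
```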
4) Identify the expression and action states of teachers and students in each frame with expression and action recognition models trained on public large-scale datasets, following the action classification coding standard approved in the educational domain knowledge extraction module M5. Expression and action recognition use the VGGNet16 and SlowFast algorithms; SlowFast is an open-source video understanding network released under the Apache-2.0 license.
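A sketch of how the two recognizers could be instantiated from public releases; the pre-trained weights, the 7-class expression head and the input shape are illustrative assumptions (training details are not given in the patent).

```python
import torch
import torchvision

# Expression recognizer: VGGNet16 backbone with its final layer replaced
# by a 7-class head (Table 1 lists 7 expressions); fine-tuning omitted.
vgg16 = torchvision.models.vgg16(weights='IMAGENET1K_V1')
vgg16.classifier[6] = torch.nn.Linear(4096, 7)

# Action recognizer: the open-source SlowFast video understanding network.
slowfast = torch.hub.load('facebookresearch/pytorchvideo',
                          'slowfast_r50', pretrained=True)

# One face crop per tracked entity -> per-frame expression logits.
face_crop = torch.randn(1, 3, 224, 224)
print(vgg16(face_crop).shape)  # torch.Size([1, 7])
```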
The teaching activity fusion analysis module M3 generates higher-level activity information from the analyzed student behavior information; concretely, it uses multi-modal machine learning to jointly map the modal cues to the matching learning activity, implemented in the following steps:
1) Map the expression, action and language modal information of each student entity into a common feature space x;
2) Based on the learning activity classification coding standard approved in the educational domain knowledge extraction module M5, train a classification model from the expression, action and language modal information to learning activities, and perform automatic activity matching and coding for each student entity. The XGBoost algorithm is used here to implement the activity matching process.
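A sketch of the activity classifier with the xgboost library; the feature dimension and the synthetic training data are placeholders, while the 13 activity codes follow Table 2.

```python
import numpy as np
from xgboost import XGBClassifier

# Feature space x: fused per-entity expression, action and language features
# (the 340-dimension total is an illustrative placeholder, e.g. 40 + 300).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 340))
y = rng.integers(0, 13, size=2000)   # 13 activity codes from Table 2

# Gradient-boosted trees; each round fits the residual of the previous round.
clf = XGBClassifier(n_estimators=200, max_depth=6, learning_rate=0.1)
clf.fit(X, y)
print(clf.predict(X[:1]))            # automatic activity matching code
```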
As shown in fig. 3, the essence of the algorithm is to grow decision trees by repeated feature splitting; each round learns one tree that fits the residual between the previous round's prediction and the actual value, and the objective function is minimized via its second-order Taylor expansion. The objective function is:
$Obj^{(t)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(t)}\right) + \sum_{k=1}^{t} \Omega(f_k)$
where the squared loss between the actual and predicted values is:
$l\left(y_i, \hat{y}_i\right) = \left(y_i - \hat{y}_i\right)^2$
and the regularization term is (where T is the number of leaves in the decision tree and w is the vector of leaf scores, whose squared L2 norm is penalized):
$\Omega(f) = \gamma T + \frac{1}{2} \lambda \left\| w \right\|^2$
when the model training is finished to obtain k decision trees, if the score of a sample is to be predicted, a corresponding leaf node is fallen in each tree according to the characteristics of the sample, each leaf node corresponds to a corresponding score, and finally the scores corresponding to each tree are added to obtain the predicted value of the sample.
The classroom engagement analysis module M4 combines the objective learning activity information to compute the engagement of students in a specific scene, where a scene is the background category in which a teaching activity occurs, such as lecturing, practice or discussion. The index dimensions used in the computation come from the educational domain knowledge extraction module M5; the raw value of each index is the product of a 1×m scene row vector, an m×n weight matrix and an n×1 activity column vector, with the weight matrix also derived from module M5; the standardized value of the index is then obtained from the raw values by the z-score (zero-mean standardization) method.
$z = \frac{x - \mu}{\sigma}$

This formula normalizes the raw data set to one with mean 0 and variance 1, where μ and σ are the mean and standard deviation of the raw data set, respectively.
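The index computation can be illustrated numerically; only the shapes (1×m scene row, m×n weights, n×1 activity column) and the z-score step come from the text above, and all matrix contents below are invented placeholders.

```python
import numpy as np

m, n = 3, 13                                  # scenes x activity states
scene = np.zeros((1, m)); scene[0, 1] = 1.0   # one-hot row: current scene
activity = np.zeros((n, 1)); activity[4, 0] = 1.0  # one-hot column: activity
W = np.full((m, n), 0.5)                      # weight matrix W_k from module M5
                                              # (Delphi-derived; placeholder values)

raw = (scene @ W @ activity).item()           # raw value of index k, a scalar

# Zero-mean standardization over the raw values collected during the session.
raw_series = np.array([0.2, 0.5, 0.5, 0.8, raw])
z = (raw_series - raw_series.mean()) / raw_series.std()
print(raw, round(z[-1], 3))
```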
The educational domain knowledge extraction module M5 consults and consolidates expert opinion to form the theoretical dimensions and indices, at every level, related to classroom engagement, comprising: an engagement-related learning behavior classification standard, an engagement-related learning activity classification standard, an engagement-related teaching scene classification standard, the measurement dimensions and indices of classroom engagement, and the weight matrices mapping each learning activity to each measurement index in each teaching scene. The steps are as follows:
1) Interface with the learning behavior analysis module M2 to formulate the learning behavior classification coding standard. From the actions and expressions that current computer vision can recognize, 59 items highly related to classroom teaching were screened out through discussion with education experts, comprising 6 head poses, 31 body movements, 7 facial expressions, 2 classes of speech and 13 interpersonal interaction actions, as shown in Table 1.
TABLE 1 Actions and expressions related to classroom teaching behaviors (59 items; table not reproduced)
2) Interface with the teaching activity fusion analysis module M3 to formulate the teaching activity classification coding standard. Building on existing classroom teaching behavior analysis instruments such as the Flanders interaction analysis coding system and S-T coding, an automatic teaching activity analysis coding table was constructed through discussion with education experts, as shown in Table 2.
TABLE 2 Automatic teaching activity analysis coding table (table not reproduced)
3) Interface with the classroom engagement analysis module M4 to formulate the teaching scene classification coding standard, specifying the scene code and scene category corresponding to each scene description, as shown in Table 3.
TABLE 3 Teaching scene classification coding standard (table not reproduced)
4) Interface with the classroom engagement analysis module M4 to approve the evaluation index dimensions related to classroom engagement, as shown in Table 4.
TABLE 4 Evaluation index dimensions related to classroom engagement (table not reproduced)
5) Interface with the classroom engagement analysis module M4 to determine, by combining subjective and objective judgment, the weight matrices from scenes and activities to each engagement evaluation index. This step can be performed with the Delphi method, a procedure for reaching expert consensus on a specific topic: first, 10 to 30 professionally representative and authoritative panel members are selected; then, through two rounds of index consultation and one round of weight determination, the m×n matrix W_k is determined, which maps the n types of learning activities under the m teaching scenes to evaluation index k. Its elements take values between 0 and 1:

$W_k = \left(w_{ij}\right)_{m \times n}, \quad 0 \le w_{ij} \le 1$
The visual feedback module M6 visually outputs the engagement-related behavior and activity recognition results and the computed evaluation indices. The visual output takes the form of video and images, produced in the following steps:
1) Align the multi-modal source data, dynamic tracking information, person behavior information and activity matching information, and output them as a classroom live analysis video, as shown in fig. 4.
2) Compute the learning engagement index score of each student in each scene over the course of the class and output it as an engagement variation curve, as shown in fig. 5.
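A sketch of this curve-output step with matplotlib; the per-minute scores below are synthetic, and the sampling interval is an assumption.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic per-minute standardized engagement scores for one student.
t = np.arange(45)                                         # minutes into class
scores = np.clip(np.cumsum(np.random.default_rng(1).normal(0, 0.3, 45)), -2, 2)

plt.plot(t, scores, label='student 01')
plt.axhline(0.0, linestyle='--', linewidth=0.8)           # session mean (z = 0)
plt.xlabel('time (min)')
plt.ylabel('learning engagement index (z-score)')
plt.title('Engagement variation curve (cf. FIG. 5)')
plt.legend()
plt.savefig('engagement_curve.png')
```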
The above system is deployed on a computer device comprising a memory, a processor, a display adapter, a communication interface and a communication bus. The memory stores a computer program executable on the processor, and the steps of the above embodiment are realized when the processor executes the program.
Hence, the human-machine collaborative multi-modal visual analysis system for student classroom engagement with this structure has the following beneficial effects:
1) Human-machine collaboration: integrating domain knowledge improves the interpretability of the analysis. To keep the computation process interpretable, the analysis framework introduces the experience of education domain experts at every key node; education experts were consulted with the Delphi method to obtain coding tables for basic actions, teaching activities, teaching scenes and engagement states, providing a knowledge base for interpretable computation.
2) Teaching activities as the hinge, enhancing the generality of the framework. Teaching activities are the basis and key link of classroom observation and analysis. Grounded in relevant education theory, the analysis framework uses teaching activities as a bridge between the low-level features a computer can recognize and high-level educational semantics. Teaching behaviors are analyzed end to end, and higher-level semantic concepts such as learning engagement are then computed with expert-derived weights, realizing automatic analysis of the whole process.
3) Whole-process multi-modal analysis automatically fused with scenes. With multi-modal data acquisition, analysis and fusion, the multi-modal teaching behaviors of teachers and students are analyzed comprehensively across four aspects: language, action, expression and head pose. Meanwhile, teaching scenes are recognized automatically from teaching behaviors, realizing scene-aware computation of learning engagement.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solution of the invention. Although the invention has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art will understand that modifications and equivalent substitutions may be made without departing from the spirit and scope of the invention.

Claims (5)

1. A human-machine collaborative multi-modal visual analysis system for student classroom engagement, characterized in that: the system comprises a multi-modal data acquisition module, a learning behavior analysis module, a teaching activity fusion analysis module, a classroom engagement analysis module and a visual feedback module connected in sequence, wherein the learning behavior analysis module, the teaching activity fusion analysis module and the classroom engagement analysis module each also connect to an educational domain knowledge extraction module;
the system comprises a multi-mode data acquisition module, a multi-mode data processing module and a multi-mode data processing module, wherein the multi-mode data acquisition module is used for acquiring original multi-mode data generated in the classroom process, and the original multi-mode data comprises classroom two-dimensional video data, classroom depth video data and classroom audio data;
the learning behavior analysis module computes and preliminarily analyzes student learning behaviors in real time from the multi-modal data sources; concretely, it identifies the expression, action and language modalities of students in the classroom through artificial intelligence algorithms;
the teaching activity fusion analysis module generates higher-level activity information from the analyzed student behavior information; concretely, it uses multi-modal machine learning to jointly map the modal cues to the matching learning activity;
the classroom engagement analysis module combines the objective learning activity information to compute the engagement of an individual student in a specific scene, where a scene is the background category in which a teaching activity occurs, including lecturing, practice and discussion; the index dimensions used in the computation come from the educational domain knowledge extraction module; the raw value of each index is the product of a 1×m scene row vector, an m×n weight matrix and an n×1 activity column vector, the weight matrix also coming from the educational domain knowledge extraction module; the standardized value of the index is then obtained from the raw values by zero-mean standardization;
the educational domain knowledge extraction module consults and consolidates expert opinion to form the theoretical dimensions and indices, at every level, related to classroom engagement, comprising: an engagement-related learning behavior classification standard, an engagement-related learning activity classification standard, an engagement-related teaching scene classification standard, the measurement dimensions and indices of classroom engagement, and the weight matrices mapping each learning activity to each measurement index in each teaching scene;
and the visual feedback module visually outputs the engagement-related behavior and activity recognition results and the computed evaluation indices; it computes the learning engagement index score of each student in each scene over the course of the class and outputs it as an engagement variation curve, the visual output taking the form of video and images.
2. The human-machine collaborative multi-modal visual analysis system for student classroom engagement according to claim 1, characterized in that: the multi-modal data acquisition module consists of two 4K cameras and one depth camera; the two 4K cameras are mounted at the upper-left and upper-right corners of the classroom blackboard and film the students in the left and right halves of the classroom respectively, while the depth camera is mounted at the center of the upper edge of the blackboard and films all students in the classroom head-on.
3. The human-machine collaborative multi-modal visual analysis system for student classroom engagement according to claim 1, characterized in that: in the learning behavior analysis module, the method for identifying the modal information of students in the classroom through artificial intelligence algorithms comprises the following steps:
1) adjusting the confidence threshold of the artificial intelligence algorithm and identifying all visible teacher and student entities with computer vision, detecting the position, category and confidence of each entity in the two-dimensional picture;
2) combining the two-dimensional entity positions with the entity depth information and performing entity-to-label mapping with a dynamic tracking algorithm whose optimization objective is to minimize the inter-frame entity position offset, i.e. the sum over all entities of the Euclidean distances between their positions in adjacent frames in three-dimensional space;
3) extracting and aligning language information through a speech recognition algorithm and a Chinese word vector algorithm, converting unstructured utterances into 300-dimensional structured vectors with a Chinese word vector model pre-trained on a public corpus;
4) identifying the expression and action states of teachers and students in each frame with expression and action recognition models trained on public large-scale datasets, following the action classification coding standard approved in the educational domain knowledge extraction module.
4. The human-machine collaborative multi-modal visual analysis system for student classroom engagement according to claim 1, characterized in that: in the teaching activity fusion analysis module, the multi-modal machine learning method comprises the following steps:
1) mapping the expression, action and language modal information of each student entity into a common feature space x;
2) training a classification model from the expression, action and language modal information to learning activities, based on the learning activity classification coding standard approved in the educational domain knowledge extraction module, and performing automatic activity matching and coding for each student entity.
5. The human-machine collaborative multi-modal visual analysis system for student classroom engagement according to claim 1, characterized in that: in the educational domain knowledge extraction module, the theoretical dimensions and indices related to classroom engagement are formed at each level through the following steps:
1) interfacing with the learning behavior analysis module to formulate the learning behavior classification coding standard: from the actions and expressions that current computer vision can recognize, screening out the behaviors and expressions highly related to classroom teaching, covering head poses, body movements, facial expressions, speech and interpersonal interaction;
2) interfacing with the teaching activity fusion analysis module to formulate the teaching activity classification coding standard: coding and annotating 13 activity states, namely listening, hands-on experiment/practice, note taking, doing exercises, operating a computer/PAD, raising a hand, standing up, reading, talking with the teacher, giving feedback to the teacher, discussing with peers, hands-on collaboration and being off-task, and constructing an automatic teaching activity analysis coding table;
3) interfacing with the classroom engagement analysis module to formulate the teaching scene classification coding standard, specifying the scene code and scene category corresponding to each scene description;
4) interfacing with the classroom engagement analysis module to approve the evaluation index dimensions related to classroom engagement;
5) interfacing with the classroom engagement analysis module to determine, by combining subjective and objective judgment, the weight matrices from scenes and activities to each engagement evaluation index.
CN202211621966.8A 2022-12-16 2022-12-16 Multi-mode visual analysis system for class investment of students through man-machine cooperation Active CN115984956B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211621966.8A CN115984956B (en) 2022-12-16 2022-12-16 Multi-mode visual analysis system for class investment of students through man-machine cooperation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211621966.8A CN115984956B (en) 2022-12-16 2022-12-16 Multi-mode visual analysis system for class investment of students through man-machine cooperation

Publications (2)

Publication Number Publication Date
CN115984956A true CN115984956A (en) 2023-04-18
CN115984956B CN115984956B (en) 2023-08-29

Family

ID=85973230

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211621966.8A Active CN115984956B (en) 2022-12-16 2022-12-16 Multi-mode visual analysis system for class investment of students through man-machine cooperation

Country Status (1)

Country Link
CN (1) CN115984956B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117351575A (en) * 2023-12-05 2024-01-05 北京师范大学珠海校区 Nonverbal behavior recognition method and nonverbal behavior recognition device based on text-generated graph data enhancement model

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100291528A1 (en) * 2009-05-12 2010-11-18 International Business Machines Corporation Method and system for improving the quality of teaching through analysis using a virtual teaching device
CN108805009A (en) * 2018-04-20 2018-11-13 华中师范大学 Classroom learning state monitoring method based on multimodal information fusion and system
US20180366013A1 (en) * 2014-08-28 2018-12-20 Ideaphora India Private Limited System and method for providing an interactive visual learning environment for creation, presentation, sharing, organizing and analysis of knowledge on subject matter
CN109697577A (en) * 2019-02-01 2019-04-30 北京清帆科技有限公司 A kind of voice-based Classroom instruction quality evaluation method
CN111275760A (en) * 2020-01-16 2020-06-12 上海工程技术大学 Unmanned aerial vehicle target tracking system and method based on 5G and depth image information
CN114708525A (en) * 2022-03-04 2022-07-05 河北工程大学 Deep learning-based student classroom behavior identification method and system
CN115146975A (en) * 2022-07-08 2022-10-04 华中师范大学 Teacher-machine-student oriented teaching effect evaluation method and system based on deep learning
CN115239527A (en) * 2022-06-27 2022-10-25 重庆市科学技术研究院 Teaching behavior analysis system for teaching characteristic fusion and modeling based on knowledge base

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100291528A1 (en) * 2009-05-12 2010-11-18 International Business Machines Corporation Method and system for improving the quality of teaching through analysis using a virtual teaching device
US20180366013A1 (en) * 2014-08-28 2018-12-20 Ideaphora India Private Limited System and method for providing an interactive visual learning environment for creation, presentation, sharing, organizing and analysis of knowledge on subject matter
CN108805009A (en) * 2018-04-20 2018-11-13 华中师范大学 Classroom learning state monitoring method based on multimodal information fusion and system
CN109697577A (en) * 2019-02-01 2019-04-30 北京清帆科技有限公司 A kind of voice-based Classroom instruction quality evaluation method
CN111275760A (en) * 2020-01-16 2020-06-12 上海工程技术大学 Unmanned aerial vehicle target tracking system and method based on 5G and depth image information
CN114708525A (en) * 2022-03-04 2022-07-05 河北工程大学 Deep learning-based student classroom behavior identification method and system
CN115239527A (en) * 2022-06-27 2022-10-25 重庆市科学技术研究院 Teaching behavior analysis system for teaching characteristic fusion and modeling based on knowledge base
CN115146975A (en) * 2022-07-08 2022-10-04 华中师范大学 Teacher-machine-student oriented teaching effect evaluation method and system based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhao Jie et al., "Intelligent Robot Technology: Research and Practice of Security, Patrol and Disposal Police Robots", China Machine Press, pages 266-268 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117351575A (en) * 2023-12-05 2024-01-05 北京师范大学珠海校区 Nonverbal behavior recognition method and nonverbal behavior recognition device based on text-generated graph data enhancement model
CN117351575B (en) * 2023-12-05 2024-02-27 北京师范大学珠海校区 Nonverbal behavior recognition method and nonverbal behavior recognition device based on text-generated graph data enhancement model

Also Published As

Publication number Publication date
CN115984956B (en) 2023-08-29

Similar Documents

Publication Publication Date Title
CN106503055B (en) A kind of generation method from structured text to iamge description
CN110992741A (en) Learning auxiliary method and system based on classroom emotion and behavior analysis
CN111915148B (en) Classroom teaching evaluation method and system based on information technology
CN111931585A (en) Classroom concentration degree detection method and device
CN112069970B (en) Classroom teaching event analysis method and device
CN105468468A (en) Data error correction method and apparatus facing question answering system
CN107301164B (en) Semantic analysis method and device for mathematical formula
CN109598226B (en) Online examination cheating judgment method based on Kinect color and depth information
CN115146162A (en) Online course recommendation method and system
CN112232276B (en) Emotion detection method and device based on voice recognition and image recognition
CN111524578A (en) Psychological assessment device, method and system based on electronic psychological sand table
CN115984956B (en) Multi-mode visual analysis system for class investment of students through man-machine cooperation
CN107578015B (en) First impression recognition and feedback system and method based on deep learning
CN110245253A (en) A kind of Semantic interaction method and system based on environmental information
CN115719516A (en) Multichannel-based classroom teaching behavior identification method and system
CN116050892A (en) Intelligent education evaluation supervision method based on artificial intelligence
CN110852071B (en) Knowledge point detection method, device, equipment and readable storage medium
KR20180058298A (en) System and method for testing a school readiness of the school-age child
CN114186983B (en) Video interview multidimensional scoring method, system, computer equipment and storage medium
CN115810163B (en) Teaching evaluation method and system based on AI classroom behavior recognition
CN110956142A (en) Intelligent interactive training system
CN111950472A (en) Teacher grinding evaluation method and system
CN117455126B (en) Ubiquitous practical training teaching and evaluation management system and method
CN116226410B (en) Teaching evaluation and feedback method and system for knowledge element connection learner state
CN115455247B (en) Classroom collaborative learning role judgment method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant