CN115984956B - Multi-mode visual analysis system for class investment of students through man-machine cooperation - Google Patents


Info

Publication number
CN115984956B
CN115984956B CN202211621966.8A CN202211621966A CN115984956B CN 115984956 B CN115984956 B CN 115984956B CN 202211621966 A CN202211621966 A CN 202211621966A CN 115984956 B CN115984956 B CN 115984956B
Authority
CN
China
Prior art keywords
classroom
input
learning
analysis module
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211621966.8A
Other languages
Chinese (zh)
Other versions
CN115984956A (en)
Inventor
蒋艳双
祁彬斌
包昊罡
黄荣怀
刘德建
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Normal University
Original Assignee
Beijing Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Normal University filed Critical Beijing Normal University
Priority to CN202211621966.8A priority Critical patent/CN115984956B/en
Publication of CN115984956A publication Critical patent/CN115984956A/en
Application granted granted Critical
Publication of CN115984956B publication Critical patent/CN115984956B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention discloses a man-machine collaborative multi-modal visual analysis system for student classroom input. The system comprises a multi-modal data acquisition module, a learning behavior analysis module, a teaching activity fusion analysis module, a classroom input analysis module and a visual feedback module connected in sequence, with an education domain knowledge extraction module adjacent to each of the learning behavior analysis module, the teaching activity fusion analysis module and the classroom input analysis module. With this structure, the system acquires and analyzes students' multi-modal data and, according to established classification standards and indexes related to classroom input, comprehensively evaluates and visually feeds back students' learning engagement in different scenes.

Description

Multi-mode visual analysis system for class investment of students through man-machine cooperation
Technical Field
The invention relates to the technical field of intelligent teaching, and in particular to a man-machine collaborative multi-modal visual analysis system for student classroom input.
Background
Classroom input analysis of students is an important basis in the fields of educational measurement and learning analytics. Existing analysis techniques fall mainly into two classes: observation and evaluation techniques based on structured scales, and coding analysis techniques based on objective behavior measurement. Observation and evaluation techniques use a series of evaluation items directed at students' classroom input as the measurement tool, but differences in observers' attention and experience often make the analysis standard unstable; such techniques are highly subjective, lack an automated basis for large-scale deployment, and suffer from low measurement reliability, high analysis cost and a single mode of result presentation. Coding analysis techniques use a manual behavior coding system, or an automatic one based on computer vision, to decompose the live classroom into discrete sequences of student behaviors, and generally take the proportion of specific first-order or multi-order (transition) behaviors in those sequences as the objective index for input analysis. They are objective enough to support large-scale deployment, but because the analysis standard is limited to single explicit behaviors and lacks human experiential input, measurement efficiency is low, interpretability is insufficient, and the data come from a single modality.
Disclosure of Invention
The invention aims to provide a man-machine collaborative multi-modal visual analysis system for student classroom input which, through the acquisition and analysis of students' modal data, comprehensively evaluates and visually feeds back students' learning engagement in different scenes according to established classification standards and indexes related to classroom input.
To achieve this aim, the invention provides a man-machine collaborative multi-modal visual analysis system for student classroom input, comprising a multi-modal data acquisition module, a learning behavior analysis module, a teaching activity fusion analysis module, a classroom input analysis module and a visual feedback module connected in sequence, with an education domain knowledge extraction module adjacent to each of the learning behavior analysis module, the teaching activity fusion analysis module and the classroom input analysis module;
the multi-modal data acquisition module is used for acquiring the original multi-modal data generated during the classroom process, including classroom two-dimensional video data, classroom depth video data and classroom audio data;
the learning behavior analysis module computes and preliminarily analyzes students' learning behaviors in real time based on the multi-modal data source, specifically by identifying the modal information of students' expressions, actions and language in class through artificial intelligence algorithms;
the teaching activity fusion analysis module is used for generating higher-level activity information from the analyzed student behavior information, specifically by collaboratively representing the information of each modality as a matched learning activity through a multi-modal machine learning method;
the classroom input analysis module is used for performing input-degree analysis and calculation for individual students by combining objective learning activity information within a specific scene, where the specific scene is the background category in which teaching activities occur, including lecturing, exercise and discussion; the specific index dimensions for the input-degree calculation are obtained from the education domain knowledge extraction module; the raw value is obtained by multiplying a scene row matrix of m columns, a weight matrix of m rows and n columns, and an activity column matrix of n rows, with the weight matrix derived from the education domain knowledge extraction module; and the standard value of each index is calculated from the raw value by a zero-mean (z-score) normalization method;
the education domain knowledge extraction module is used for consulting and combining expert opinions to form the theoretical dimensions and indexes at each level related to classroom input, which include: a learning behavior classification standard related to classroom input, a learning activity classification standard related to classroom input, a teaching scene classification standard related to classroom input, classroom input measurement dimensions and indexes, and a weight matrix mapping each learning activity to each measurement index in each teaching scene;
the visual feedback module is used for visually outputting the behavior and activity recognition results and the evaluation index calculation results related to classroom input: it calculates each student's learning engagement index score in each scene during the classroom process and outputs the result as an input-degree change curve, with video and images as the visual output modes.
Preferably, the multi-modal data acquisition module consists of two 4K cameras and one depth camera; the two 4K cameras are arranged at the upper left and upper right corners of the classroom blackboard respectively, the depth camera is arranged at the center of the upper edge of the blackboard, the two 4K cameras capture the students on the left and right halves of the classroom respectively, and the central depth camera captures all students in the classroom from the front.
Preferably, in the learning behavior analysis module, the implementation method for identifying the student modal information in the class through the artificial intelligence algorithm comprises the following steps:
1) Adjusting the confidence threshold through an artificial intelligence algorithm, identifying all visible teacher and student entities through computer vision, and detecting the position, category and confidence of each entity in the two-dimensional frame;
2) Combining the two-dimensional entity position information with the entity depth information, and performing entity-label mapping via a dynamic tracking algorithm whose optimization target is minimizing the inter-frame entity position offset, i.e. minimizing the sum of the Euclidean-distance offsets of all entities between adjacent frames in three-dimensional space;
3) Extracting and aligning language information through a speech recognition algorithm and a Chinese word vector algorithm, converting unstructured language information into 300-dimensional structured vectors using a Chinese word-vector pre-training model based on a public corpus;
4) Recognizing the expression and action states of each teacher and student entity in each frame, using expression and action recognition models trained on public large-scale data sets, according to the behavior classification coding standard reviewed in the education domain knowledge extraction module.
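Step 1) above can be sketched as follows. This is an illustrative example only: the detection tuples, class names and the threshold value of 0.5 are invented for demonstration and do not appear in the patent.

```python
# Sketch of confidence-threshold filtering applied to raw detector output.
# Each detection is (box, category, confidence), where box is (x, y, w, h)
# in two-dimensional image coordinates. All values here are invented.

def filter_detections(detections, conf_threshold=0.5):
    """Keep only entities whose confidence meets the adjustable threshold."""
    return [d for d in detections if d[2] >= conf_threshold]

raw = [
    ((120, 40, 60, 80), "student", 0.91),
    ((300, 42, 58, 79), "student", 0.34),   # low confidence, dropped
    ((510, 38, 61, 82), "teacher", 0.88),
]
kept = filter_detections(raw, conf_threshold=0.5)
```

Raising the threshold trades recall for precision; the patent only states that the threshold is adjustable, not how it is chosen.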
Preferably, in the teaching activity fusion analysis module, the implementation steps of the multi-mode machine learning method include:
1) Mapping the expression, action and language modal information of the student entity into the same feature space x;
2) Based on the learning activity classification coding standard reviewed in the education domain knowledge extraction module, training a classification model from the modal information of expression, action and language to learning activities, and performing automatic activity-matching coding for student entities.
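Step 1) amounts to placing the per-modality features into one feature space x. A minimal sketch follows, assuming simple vector concatenation as the fusion operator (the patent does not specify the operator) and invented feature dimensions:

```python
# Hypothetical sketch: fusing expression, action and language features into
# one sample x by concatenation. The dimensions are illustrative; the patent
# only fixes the language vector at 300 dimensions, shortened here for clarity.

def fuse_features(expression_vec, action_vec, language_vec):
    """Concatenate per-modality feature vectors into one sample x."""
    return expression_vec + action_vec + language_vec

x = fuse_features([0.1, 0.7], [0.0, 1.0, 0.0], [0.2] * 3)
# len(x) is the sum of the per-modality dimensions (2 + 3 + 3)
```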
Preferably, in the education domain knowledge extraction module, the steps for forming each level of theoretical dimension and index related to classroom input are as follows:
1) Interfacing with the learning behavior analysis module to prepare the learning behavior classification coding standard: from the actions and expressions recognizable by current computers, screening out behaviors and expressions highly relevant to classroom teaching, including head postures, body actions, expressions, speech and interpersonal interaction actions;
2) Interfacing with the teaching activity fusion analysis module to prepare the teaching activity classification coding standard: defining and coding 13 student activity states distilled from the classroom, namely listening, speaking, hands-on experiment/practice, note taking, exercise, computer/PAD operation, hand raising, standing, reading, conversing with the teacher, giving feedback to the teacher, peer discussion and hands-on collaboration, and constructing an automatic teaching activity analysis coding table;
3) Interfacing with the classroom input analysis module to prepare the teaching scene classification coding standard, preparing the corresponding scene codes and scene categories for different scene descriptions;
4) Interfacing with the classroom input analysis module to review and verify the evaluation index dimensions related to classroom input;
5) Interfacing with the classroom input analysis module to determine the weight matrix from scenes and activities to the input evaluation indexes, combining subjective and objective approaches.
Therefore, the man-machine collaborative student classroom input multi-modal visual analysis system with this structure achieves interpretability by decomposing the model's calculation process and introducing domain knowledge, so that the analysis results for general scenes can directly interpret the educational behaviors and actions of teachers and students. It adopts non-invasive multi-modal data acquisition and analysis to analyze learning behaviors and learning input, overcoming the insufficient information content of a single modality and its susceptibility to external factors, which is of important value for improving the accuracy of collaborative learning input analysis. Finally, it introduces the scene into the analysis flow, explores how learning input changes within each scene, and characterizes different learning variables of students in different scenes, such as behavior, language and degree of learning input.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
FIG. 1 is a schematic diagram of the man-machine collaborative student classroom input multi-modal visual analysis system;
FIG. 2 is a schematic diagram of the distribution of the multi-modal data acquisition module according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of model training with the XGBoost algorithm in the teaching activity fusion analysis module according to an embodiment of the present invention;
FIG. 4 is a live classroom analysis video output by the visual feedback module according to an embodiment of the present invention;
FIG. 5 is an input-degree change curve output by the visual feedback module according to an embodiment of the present invention.
Reference numerals
1. blackboard; 2. 4K camera; 3. depth camera; M1. multi-modal data acquisition module; M2. learning behavior analysis module; M3. teaching activity fusion analysis module; M4. classroom input analysis module; M5. education domain knowledge extraction module; M6. visual feedback module.
Detailed Description
The technical scheme of the invention is further described below through the attached drawings and the embodiments.
Examples
As shown in fig. 1, the man-machine collaborative student classroom input multi-modal visual analysis system comprises a multi-modal data acquisition module M1, a learning behavior analysis module M2, a teaching activity fusion analysis module M3, a classroom input analysis module M4 and a visual feedback module M6 connected in sequence, with an education domain knowledge extraction module M5 adjacent to each of the learning behavior analysis module M2, the teaching activity fusion analysis module M3 and the classroom input analysis module M4;
the multi-mode data acquisition module M1 is used for acquiring original multi-mode data generated in a classroom process, wherein the original multi-mode data comprises classroom two-dimensional video data, classroom depth video data and classroom audio data; the multi-mode data acquisition module M1 is composed of 2 4K cameras 2 and 1 depth camera 3, as shown in FIG. 2, the two 4K cameras 2 are respectively arranged at the left upper corner and the right upper corner of the classroom blackboard 1, the depth camera 3 is arranged at the center of the upper edge of the blackboard 1, the two 4K cameras 2 respectively shoot students at the left half side and the right half side in the classroom, and the central depth camera 3 shoots all students in the classroom forwards.
The learning behavior analysis module M2 computes and preliminarily analyzes students' learning behaviors in real time based on the multi-modal data source, specifically by identifying the modal information of students' expressions, actions and language in class through artificial intelligence algorithms. The implementation method comprises the following steps:
1) Adjusting the confidence threshold through an artificial intelligence algorithm and identifying all visible teacher and student entities through computer vision. Entity recognition uses the Yolo-v5 algorithm, an open-source object detection network released under the GPL-3.0 license, which detects the position, category and confidence of each entity in the two-dimensional frame;
2) Combining the two-dimensional entity position information with the entity depth information, performing entity-label mapping via a dynamic tracking algorithm whose optimization target is minimizing the inter-frame entity position offset, i.e. minimizing the sum of the Euclidean-distance offsets of all entities between adjacent frames in three-dimensional (x, y, z) space. The dynamic tracking algorithm can set a maximum offset threshold L; when the offset of a single entity between adjacent frames is greater than L, that entity's recognition is judged abnormal and the frame is skipped;
3) Extracting and aligning language information through a speech recognition algorithm and a Chinese word vector algorithm, converting unstructured language information into 300-dimensional structured vectors using a Chinese word-vector pre-training model based on a public corpus;
4) Recognizing the expression and action states of each teacher and student entity in each frame, according to the behavior classification coding standard reviewed in the education domain knowledge extraction module M5, using expression and action recognition models trained on public large-scale data sets. Expression recognition uses VGGNet16, and action recognition uses the SlowFast algorithm, an open-source video understanding network released under the Apache-2.0 license.
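The entity-label mapping of step 2) can be sketched as an assignment problem that minimizes the total three-dimensional Euclidean offset between adjacent frames, with the maximum offset threshold L flagging anomalies. The sketch below brute-forces the assignment and uses invented coordinates; a real system would use an efficient method such as the Hungarian algorithm, which the patent does not name.

```python
import math
from itertools import permutations

def track_step(prev, curr, max_offset):
    """Map current-frame entities to previous-frame labels by minimizing the
    sum of 3-D Euclidean offsets; an entity whose offset exceeds max_offset
    (the threshold L in the text) is flagged as a recognition anomaly and
    left out of the mapping."""
    best, best_cost = None, float("inf")
    # Brute-force search over assignments (fine for a handful of entities).
    for perm in permutations(range(len(curr))):
        cost = sum(math.dist(prev[i], curr[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    mapping = {}
    for i, j in enumerate(best):
        if math.dist(prev[i], curr[j]) <= max_offset:
            mapping[i] = j          # label i carries over to detection j
    return mapping

prev = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]   # entity positions, frame t-1
curr = [(5.2, 0.0, 0.0), (0.1, 0.0, 0.0)]   # detections, frame t
mapping = track_step(prev, curr, max_offset=1.0)
```

Here the globally cheapest assignment pairs each previous entity with the nearby current detection, and both offsets fall under the threshold, so both labels carry over.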
The teaching activity fusion analysis module M3 is used for generating higher-level activity information from the analyzed student behavior information, specifically by collaboratively representing the information of each modality as a matched learning activity through a multi-modal machine learning method. The implementation steps are:
1) Mapping the expression, action and language modal information of the student entity into the same feature space x;
2) Based on the learning activity classification coding standard reviewed in the education domain knowledge extraction module M5, training a classification model from the modal information of expression, action and language to learning activities, and performing automatic activity-matching coding for student entities. The XGBoost algorithm is used here to implement the activity matching process.
The essence of the algorithm is to grow decision trees by repeatedly splitting on features, with each round learning a tree that fits the residual between the previous round's predicted values and the actual values; the objective function is minimized via a second-order Taylor expansion, as shown in fig. 3. The objective function is:
Obj = Σ_i l(y_i, ŷ_i) + Σ_k Ω(f_k)
The squared loss between the actual value and the predicted value is:
l(y_i, ŷ_i) = (y_i − ŷ_i)²
The regularization function is (where T is the number of leaves in the decision tree and ‖w‖² is the squared L2 norm of the tree's leaf scores):
Ω(f) = γT + ½ λ‖w‖²
When model training is complete and k decision trees have been obtained, the score of a sample is predicted as follows: according to its features, the sample falls to a corresponding leaf node in each tree, each leaf node carries a score, and the scores from all trees are summed to obtain the sample's predicted value.
The classroom input analysis module M4 is configured to perform input-degree analysis and calculation for individual students by combining objective learning activity information within a specific scene, where the specific scene is the background category in which teaching activities occur, such as lecturing, exercise and discussion. The specific index dimensions for the input-degree calculation come from the education domain knowledge extraction module M5; the raw value is calculated by multiplying a scene row matrix of m columns, a weight matrix of m rows and n columns, and an activity column matrix of n rows, with the weight matrix derived from the education domain knowledge extraction module M5; and the standard value of each index is calculated from the raw value by the z-score (zero-mean normalization) method.
The z-score formula, z = (x − μ) / σ, normalizes the original data set to a data set with mean 0 and variance 1, where μ and σ are the mean and standard deviation of the original data set, respectively.
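The raw-value and standard-value calculations can be sketched as follows, assuming a one-hot scene row vector (1 x m) multiplied by the weight matrix (m x n) and the activity column vector (n x 1) to give a scalar. The weight matrix, scene labels and activity values are invented for illustration:

```python
# Sketch of the engagement computation: scene row vector x weight matrix x
# activity column vector, then z-score normalization. All numbers invented.

def raw_engagement(scene, weights, activity):
    """scene: length-m one-hot list; weights: m x n; activity: length-n."""
    return sum(
        scene[i] * weights[i][j] * activity[j]
        for i in range(len(scene))
        for j in range(len(activity))
    )

def z_scores(values):
    """Zero-mean normalization: (x - mu) / sigma over the data set."""
    mu = sum(values) / len(values)
    sigma = (sum((v - mu) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mu) / sigma for v in values]

W = [[0.9, 0.2, 0.1],    # hypothetical scene "lecturing"
     [0.3, 0.8, 0.4]]    # hypothetical scene "discussion"
scene = [1, 0]           # current scene: lecturing (one-hot)
activity = [1, 0, 0]     # observed activity: listening (one-hot)
raw = raw_engagement(scene, W, activity)
```

With one-hot vectors the product simply selects the weight W[scene][activity]; with fractional activity proportions it becomes a weighted sum, which matches the matrix product described in the text.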
The education domain knowledge extraction module M5 is configured to consult and combine expert opinions to form the theoretical dimensions and indexes at each level related to classroom input. The steps are as follows:
1) Interfacing with the learning behavior analysis module M2 to prepare the learning behavior classification coding standard. From the actions and expressions recognizable by current computers, 59 behaviors and expressions highly relevant to classroom teaching were selected through discussion with education experts, comprising 6 head postures, 31 body actions, 7 expressions, 2 types of speech and 13 interpersonal interaction actions, as shown in table 1.
TABLE 1 behaviors and expressions related to classroom teaching behaviors
2) Interfacing with the teaching activity fusion analysis module M3 to prepare the teaching activity classification coding standard. Based on existing teaching behavior analysis indexes such as the Flanders interaction analysis coding system and S-T coding, the automatic teaching activity analysis coding table was constructed through discussion with education experts, as shown in table 2.
Table 2 teaching activity automatic analysis coding table
3) Interfacing with the classroom input analysis module M4 to prepare the teaching scene classification coding standard, preparing the corresponding scene codes and scene categories for different scene descriptions, as shown in table 3.
Table 3 teaching scene classification coding standard
4) Interfacing with the classroom input analysis module M4 to review the evaluation index dimensions related to classroom input, as shown in table 4.
Table 4 evaluation index dimension related to classroom input
5) Interfacing with the classroom input analysis module M4 to determine the weight matrix from scenes and activities to each input evaluation index, combining subjective and objective approaches. This step may employ the Delphi method, a technique for reaching expert consensus on a specific topic: first, 10 to 30 panel members with professional representativeness and authority are selected; then, through two rounds of index questionnaires and one round of weight determination, the m x n matrix W_k is determined that maps the n classes of learning activities under the m teaching scenes to the corresponding calculation index k, with the elements of the matrix taking values between 0 and 1.
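The aggregation of expert ratings into the weight matrix W_k can be sketched as an element-wise mean over the panel's rating matrices. The patent does not specify the aggregation rule, so this averaging scheme and all rating values are assumptions:

```python
# Hypothetical sketch: each expert submits an m x n matrix of ratings in
# [0, 1]; the panel consensus W_k is taken here as the element-wise mean.

def aggregate_weights(expert_matrices):
    """Element-wise mean of the experts' m x n rating matrices."""
    m, n = len(expert_matrices[0]), len(expert_matrices[0][0])
    return [
        [sum(mat[i][j] for mat in expert_matrices) / len(expert_matrices)
         for j in range(n)]
        for i in range(m)
    ]

panel = [
    [[0.8, 0.2], [0.4, 0.6]],   # expert 1 (values invented)
    [[0.6, 0.4], [0.2, 0.8]],   # expert 2 (values invented)
]
W_k = aggregate_weights(panel)
```

Averaging ratings that each lie in [0, 1] guarantees the aggregated elements also lie in [0, 1], consistent with the range stated above; a real Delphi process would iterate this with feedback between rounds until consensus.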
The visual feedback module M6 is used for visually outputting the behavior and activity recognition results and the evaluation index calculation results related to classroom input. The visual output takes the form of video and images, comprising:
1) The aligned multi-modal data source information, dynamic tracking information, character behavior information and activity matching information, output as a live classroom analysis video, as shown in fig. 4.
2) The learning engagement index scores of each student in each scene during the classroom process, output as input-degree change curves, as shown in fig. 5.
The above system is deployed on a computer device comprising a memory, a processor, a display adapter, a communication interface and a communication bus; the memory stores a computer program executable on the processor which, when executed, implements the steps of the embodiments described above.
Therefore, the man-machine collaborative student class input multi-mode visual analysis system adopting the structure has the following beneficial effects:
1) It integrates domain knowledge to improve the interpretability of the analysis in a man-machine collaborative way. To support an interpretable calculation process, the analysis framework introduces the experience of education domain experts at each key node; the experts are consulted via the Delphi method to obtain coding tables for basic actions, teaching activities, teaching scenes and input states, providing a knowledge base for interpretable calculation.
2) It uses teaching activities as a pivot to enhance the generality of the framework. Teaching activities are the basis and key link of classroom observation and analysis. Grounded in relevant education theory, the analysis framework uses teaching activities as a bridge between the low-level features recognizable by computers and high-level educational semantics: teaching behaviors are analyzed with an end-to-end method, and the high-level semantic concept of learning input is then calculated through expert weighting, realizing fully automatic whole-process analysis.
3) It automatically integrates whole-process multi-modal analysis with scenes. Multi-modal data acquisition, analysis and fusion methods realize comprehensive analysis of teachers' and students' multi-modal teaching behaviors across four aspects: language, action, expression and head posture. At the same time, teaching scenes are automatically identified on the basis of teaching behaviors, realizing scene-aware calculation of learning input.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solution of the present invention. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art will understand that modifications or equivalent substitutions may be made to the technical solution of the invention without departing from its spirit and scope.

Claims (3)

1. A man-machine collaborative student classroom input multi-modal visual analysis system, characterized in that: the system comprises a multi-modal data acquisition module, a learning behavior analysis module, a teaching activity fusion analysis module, a classroom input analysis module and a visual feedback module connected in sequence, with an education domain knowledge extraction module adjacent to each of the learning behavior analysis module, the teaching activity fusion analysis module and the classroom input analysis module;
the multi-mode data acquisition module is used for acquiring original multi-mode data generated in the classroom process, wherein the original multi-mode data comprises classroom two-dimensional video data, classroom depth video data and classroom audio data;
the learning behavior analysis module calculates and primarily analyzes the learning behavior of the students on the basis of the multi-mode data source in real time, and is specifically embodied as identifying the modal information of the expressions, actions and languages of the students in the class through an artificial intelligent algorithm;
the teaching activity fusion analysis module is used for generating higher-level activity information based on student behavior information analysis, and is specifically embodied in that each mode of information is cooperatively represented as matched learning activities through a multi-mode machine learning method, and the implementation steps of the multi-mode machine learning method comprise:
1) Mapping the expression, action and language mode information of the student entity into the same feature space x;
2) Training a classification model from expression, action and language modal information to learning activities based on the learning activity classification coding standard examined in the education field knowledge extraction module, and performing automatic student entity activity matching coding;
the classroom input analysis module is used for performing input-degree analysis and calculation for individual students by combining objective learning activity information within a scene, wherein the scene is the background category in which teaching activities occur, including lecturing, exercise and discussion; the specific index dimensions of the input-degree calculation are obtained from the education field knowledge extraction module; the raw value of the input-degree calculation is obtained by multiplying a scene row matrix of m columns, a weight matrix of m rows and n columns, and an activity column matrix of n rows, the weight matrix being obtained from the education field knowledge extraction module; and the standard value of the index is calculated from the raw value by a zero-mean normalization method;
the education field knowledge extraction module is used for inquiring and combining expert opinions to form theoretical dimensions and indexes of each level related to classroom investment, wherein the theoretical dimensions and indexes of each level comprise: a learning behavior classification standard related to classroom input, a learning activity classification standard related to classroom input, a teaching scene classification standard related to classroom input, a classroom input measurement dimension and index, and a weight matrix of each learning activity corresponding to each measurement index in each teaching scene; the steps for forming theoretical dimensions and indexes of each level related to classroom input are as follows:
1) Interfacing with the learning behavior analysis module to prepare a learning behavior classification coding standard: according to the actions and expressions that current computer vision techniques can identify, screening out the behaviors and expressions highly relevant to classroom teaching, including head posture, body movements, expressions, speech and interpersonal interaction;
2) Interfacing with the teaching activity fusion analysis module to prepare a teaching activity classification coding standard: carrying out coding interpretation on the 13 student activity states separated from the classroom (listening and speaking, hands-on experiment/practice, note taking, exercise, computer/PAD operation, hand raising, standing, reading, conversation with the teacher, giving feedback to the teacher, peer discussion and hands-on cooperation), and constructing an automatic teaching activity analysis coding table;
3) Interfacing with the classroom input degree analysis module to prepare a teaching scene classification coding standard, assigning a corresponding scene code and scene category to each scene description;
4) Interfacing with the classroom input degree analysis module, reviewing and verifying the evaluation index dimensions related to classroom input;
5) Interfacing with the classroom input degree analysis module, determining the weight matrix from scenes and activities to the input evaluation indexes by combining subjective and objective methods;
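Step 5) can be sketched as blending an expert-assigned (subjective) weight table with a data-driven (objective) one. The patent only states that the two are combined; the 50/50 blend, the entropy-weight framing, the matrix shapes, and all numbers below are assumptions.

```python
import numpy as np

# expert-assigned (subjective) weights, one row per scene (values assumed)
subjective = np.array([[0.5, 0.3, 0.2],
                       [0.2, 0.5, 0.3]])
# data-driven (objective) weights, e.g. from an entropy-weight calculation
objective = np.array([[0.4, 0.4, 0.2],
                      [0.3, 0.3, 0.4]])

W = 0.5 * subjective + 0.5 * objective  # blend the two sources (ratio assumed)
W = W / W.sum(axis=1, keepdims=True)    # renormalize each scene's row to sum to 1
```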
the visual feedback module is used for visually outputting the behavior and activity recognition results and the calculated evaluation indexes related to classroom input, calculating each student's learning input index score in each scene over the course of the class, outputting the evaluation results as an input degree change curve, and outputting video and images as the visual output.
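The per-student, per-scene scores behind the input degree change curve can be sketched as a grouped mean over frames. The frame counts, scene labels, and synthetic raw values below are illustrative assumptions, not data from the patent.

```python
import numpy as np

rng = np.random.default_rng(1)
n_students, n_frames = 4, 12
raw = rng.random((n_students, n_frames))          # raw input degree per frame
scene_of_frame = np.array([0]*4 + [1]*4 + [2]*4)  # scene label of each frame

# per-student, per-scene mean score: the values behind the output curve
per_scene = np.stack(
    [raw[:, scene_of_frame == s].mean(axis=1) for s in range(3)],
    axis=1,
)
```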
2. The human-computer collaborative student class input multi-mode visual analysis system according to claim 1, wherein: the multi-mode data acquisition module consists of two 4K cameras and one depth camera; the two 4K cameras are mounted at the upper-left and upper-right corners of the classroom blackboard and respectively capture the students on the left and right halves of the classroom, and the depth camera is mounted at the center of the blackboard's upper edge and captures all students in the classroom from the front.
3. The human-computer collaborative student class input multi-mode visual analysis system according to claim 1, wherein: in the learning behavior analysis module, the method for identifying student modal information in class through artificial intelligence algorithms comprises the following steps:
1) Adjusting the confidence threshold of the artificial intelligence algorithm and identifying all visible teacher and student entities through computer vision, detecting the position, category and confidence of each entity in the two-dimensional picture;
2) Combining the two-dimensional entity position information with the entity depth information, performing entity-label mapping through a dynamic tracking algorithm whose optimization target is minimizing the inter-frame entity position offset, i.e. minimizing the sum of the Euclidean distances in three-dimensional space between corresponding entities in adjacent frames;
3) Extracting and aligning language information through a speech recognition algorithm and a Chinese word vector algorithm, applying a Chinese word vector pre-training model based on a public corpus to convert unstructured language information into 300-dimensional structured vectors;
4) Recognizing the expression and action states of the teacher and student entities in each frame using expression and action recognition models trained on public large-scale datasets, based on the behavior classification coding standard reviewed in the education field knowledge extraction module.
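Step 2) of the method above is, in effect, an assignment problem: match each entity in the previous frame to a detection in the current frame so that the total 3-D Euclidean offset is minimized. A Hungarian-algorithm solver is used here as a stand-in for the patent's dynamic tracking algorithm, with synthetic positions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

# 3-D positions (x, y, depth) of entities in two adjacent frames (synthetic)
prev = np.array([[0.0, 0.0, 2.0],
                 [1.0, 0.0, 2.5],
                 [2.0, 1.0, 3.0]])
curr = np.array([[2.1, 1.0, 3.0],
                 [0.1, 0.0, 2.0],
                 [1.0, 0.1, 2.5]])

cost = cdist(prev, curr)                # pairwise Euclidean distances
row, col = linear_sum_assignment(cost)  # minimize the summed offset
mapping = {int(r): int(c) for r, c in zip(row, col)}  # prev label -> detection
```

Each previous-frame entity keeps its label by inheriting it through `mapping`, which is what lets identities persist across frames.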
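Step 3) can be sketched by averaging pretrained 300-dimensional word vectors over the tokens of a recognized utterance. The toy vocabulary and averaging scheme below are assumptions standing in for a public-corpus Chinese word-vector model; the patent specifies only the 300-dimensional output.

```python
import numpy as np

DIM = 300
rng = np.random.default_rng(2)
# toy vocabulary standing in for a pretrained word-vector model
word_vectors = {w: rng.normal(size=DIM) for w in ["please", "open", "the", "book"]}

def sentence_vector(tokens):
    """Average the word vectors of known tokens; zero vector if none are known."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(DIM)

v = sentence_vector(["please", "open", "the", "book"])
```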
CN202211621966.8A 2022-12-16 2022-12-16 Multi-mode visual analysis system for class investment of students through man-machine cooperation Active CN115984956B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211621966.8A CN115984956B (en) 2022-12-16 2022-12-16 Multi-mode visual analysis system for class investment of students through man-machine cooperation


Publications (2)

Publication Number Publication Date
CN115984956A CN115984956A (en) 2023-04-18
CN115984956B true CN115984956B (en) 2023-08-29

Family

ID=85973230


Country Status (1)

Country Link
CN (1) CN115984956B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117351575B (en) * 2023-12-05 2024-02-27 北京师范大学珠海校区 Nonverbal behavior recognition method and nonverbal behavior recognition device based on text-generated graph data enhancement model

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108805009A (en) * 2018-04-20 2018-11-13 华中师范大学 Classroom learning state monitoring method based on multimodal information fusion and system
CN109697577A (en) * 2019-02-01 2019-04-30 北京清帆科技有限公司 A kind of voice-based Classroom instruction quality evaluation method
CN111275760A (en) * 2020-01-16 2020-06-12 上海工程技术大学 Unmanned aerial vehicle target tracking system and method based on 5G and depth image information
CN114708525A (en) * 2022-03-04 2022-07-05 河北工程大学 Deep learning-based student classroom behavior identification method and system
CN115146975A (en) * 2022-07-08 2022-10-04 华中师范大学 Teacher-machine-student oriented teaching effect evaluation method and system based on deep learning
CN115239527A (en) * 2022-06-27 2022-10-25 重庆市科学技术研究院 Teaching behavior analysis system for teaching characteristic fusion and modeling based on knowledge base

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8682241B2 (en) * 2009-05-12 2014-03-25 International Business Machines Corporation Method and system for improving the quality of teaching through analysis using a virtual teaching device
US20180366013A1 (en) * 2014-08-28 2018-12-20 Ideaphora India Private Limited System and method for providing an interactive visual learning environment for creation, presentation, sharing, organizing and analysis of knowledge on subject matter


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhao Jie et al. Intelligent Robot Technology: Research and Practice of Security, Patrol and Disposal Police Robots. China Machine Press, 2021, pp. 266-268. *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant