CN115966003A - System for evaluating online learning efficiency of learner based on emotion recognition - Google Patents

System for evaluating online learning efficiency of learner based on emotion recognition

Info

Publication number
CN115966003A
Authority
CN
China
Prior art keywords
module
real
image
time
learner
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211456127.5A
Other languages
Chinese (zh)
Inventor
刘慧�
李创奇
时清玮
王宇航
李东辉
孙淑静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan Normal University
Original Assignee
Henan Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan Normal University filed Critical Henan Normal University
Priority to CN202211456127.5A priority Critical patent/CN115966003A/en
Publication of CN115966003A publication Critical patent/CN115966003A/en
Pending legal-status Critical Current

Classifications

    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D — CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an emotion-recognition-based system for evaluating learners' online learning efficiency. The system comprises a real-time camera monitoring module, an image collection module, an image processing module, and a statistical analysis module. The real-time camera monitoring module acquires key data; the image processing module analyzes and processes the data; and the statistical analysis module performs statistical analysis on the constructed model data. Compared with the prior art, the invention has the following advantage: during online teaching, the learner's built-in high-definition camera captures facial expression features, head posture, and hand skeleton behavior, which are combined through a weighted calculation to yield the learner's overall learning efficiency. This effectively reduces the error rate of emotion recognition based solely on non-physiological signals and improves fault tolerance.

Description

An online learning efficiency evaluation system for learners based on emotion recognition

Technical Field

The invention relates to the field of education, and in particular to an emotion-recognition-based system for evaluating learners' online learning efficiency.

Background Art

Emotion recognition originally referred to an individual's recognition of other people's emotions; it now mostly refers to AI automatically identifying an individual's emotional state from physiological or non-physiological signals. It is an important component of affective computing. Emotion is a state that integrates human feeling, thought, and behavior, and it plays an important role in interpersonal communication. Common emotion recognition methods fall into two broad categories: recognition based on non-physiological signals and recognition based on physiological signals. Applications of emotion recognition are increasingly widespread. In medicine, emotion recognition can inform the diagnosis and treatment of mental illness; in user-experience research, capturing users' emotional changes during product studies helps uncover product problems; and in transportation, timely detection of the emotional state of staff whose work demands sustained concentration is an effective means of preventing accidents.

At present, emotion recognition technology is gradually being applied in education as well. Monitoring technology, the foundation of non-physiological signal recognition, is used not only to detect exam cheating and correct inappropriate classroom behavior but also in deeper integrations: for example, the collaboration between Meezao (蜜枣网) and Microsoft Research Asia enables long-term, comprehensive analysis of learners to evaluate teaching efficiency. Many shortcomings remain, however. Emotion recognition based on non-physiological signals has low baseline reliability, and the above collaboration determines learning efficiency from facial expression alone, with poor reliability and low accuracy. This scheme therefore proposes a non-physiological emotion recognition method: an evaluation system that judges learners' online learning efficiency through head posture detection, emotion recognition, and skeleton detection, substantially improving detection accuracy.

Summary of the Invention

The purpose of the present invention is to provide an emotion-recognition-based system for evaluating learners' online learning efficiency.

To solve the above technical problem, the present invention provides the following technical solution: an emotion-recognition-based system for evaluating learners' online learning efficiency, comprising a real-time camera monitoring module, an image collection module, an image processing module, and a statistical analysis module within the system architecture. The real-time camera monitoring module acquires key data; the image processing module analyzes and processes the data; and the statistical analysis module performs statistical analysis on the constructed model data.

The working steps of the emotion-recognition-based system for evaluating learners' online learning efficiency are as follows:

1) Face detection based on video images. Given a video image, the system must locate and detect a person's head posture, face position, neck, arms, and the flexion, extension, and posture of the five fingers. This comprises three parts: image segmentation, target locking, and feature extraction.

2) Analysis and judgment of facial expressions. A histogram of oriented gradients (HOG) over texture features extracts features from dimensionality-reduced images, improving the efficiency of the model's feature extraction and raising performance at the input level; weighted operations combined with linear interpolation fully express image details and overall features, and a CBAM attention module is added to the network (see the HOG sketch after this list).

3) Feature definition. Learners' emotional states are divided into three main categories through expression features: "happy", "neutral", and "bored". These three states are judged by combining facial expressions with the head's twist angle and posture.
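
As a concrete illustration of step 2's input stage, here is a minimal sketch of HOG feature extraction, assuming scikit-image; the image size and HOG parameters are illustrative choices, not values specified by the invention.

```python
# Hypothetical sketch of the HOG stage in step 2 using scikit-image:
# resize with bilinear interpolation, then compute the HOG descriptor.
# Parameter values are illustrative and not taken from the patent.
from skimage import color, transform
from skimage.feature import hog

def hog_features(image_rgb, size=(128, 128)):
    """Reduce an RGB frame to a compact HOG texture descriptor."""
    gray = color.rgb2gray(image_rgb)          # drop color channels
    gray = transform.resize(gray, size)       # bilinear interpolation by default
    return hog(gray,
               orientations=9,                # gradient direction bins
               pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))        # block-normalized descriptor
```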

Compared with the prior art, the present invention has the following advantage: during online teaching, the built-in high-definition camera reads the learner's head features in real time, and facial expression features, head posture, and hand skeleton behavior are organically combined. Through a weighted calculation over these three components, the learner's learning efficiency is obtained comprehensively, effectively reducing the error rate of emotion recognition based solely on non-physiological signals and improving fault tolerance.

Further, the real-time camera monitoring module controls the laptop or desktop camera during an online class to monitor the student's learning state in front of the screen in real time, and transmits the captured video signal to the computer's main control and computing center in real time.

Further, the real-time camera monitoring module includes a face detection function, an expression recognition function, a head posture analysis function, and a body behavior judgment function.

Further, the image collection module automatically collects the required audio and video as data material by adjusting the device's read frequency, i.e., the number of digitizations within a given period, providing the data basis for subsequent modules.
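
A minimal sketch of such a collection loop follows, assuming OpenCV for camera access; the sampling rate and duration are illustrative parameters, not values fixed by the invention.

```python
# Hypothetical sketch of the image collection module using OpenCV;
# sample_hz models the configurable read frequency (digitizations per
# unit time) described above.
import time
import cv2

def collect_frames(duration_s=60.0, sample_hz=2.0, camera_index=0):
    """Sample the webcam at sample_hz for duration_s seconds."""
    cap = cv2.VideoCapture(camera_index)
    frames = []
    t_end = time.time() + duration_s
    try:
        while time.time() < t_end:
            ok, frame = cap.read()
            if ok:
                frames.append(frame)        # BGR frame for later modules
            time.sleep(1.0 / sample_hz)     # enforce the read frequency
    finally:
        cap.release()
    return frames
```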

Further, the image processing module performs face detection on the image target, extracts the facial feature region, divides it into 468 key points, and tracks the learner's body behavior, such as arms and fingers, in real time according to the model, providing reference data for the next module.

Further, the statistical analysis module feeds the data provided by the previous module into the model for comparison to obtain the student's attention to and participation in the course content; when participation is low at a given moment, it records the time and issues a real-time reminder.

Brief Description of the Drawings

Figure 1 is a schematic diagram of the architecture of the present invention.

Figure 2 is a block diagram of facial feature point extraction in the present invention.

Figure 3(a) is a schematic close-up image of an individual student.

Figure 3(b) is a schematic diagram of facial feature point detection results on a partial side-view image.

Figure 3(c) is a schematic diagram of the detection of some body parts.

Figure 4 is a framework diagram of the preliminary expression-based judgment.

Figure 5 is a schematic analysis of student psychology and facial features common in online teaching.

Figure 6 is a data diagram of expression feature recognition in the embodiment.

Figure 7(a) is a schematic diagram of one facial expression judgment in the embodiment.

Figure 7(b) is a schematic diagram of another facial expression judgment in the embodiment.

Figure 8 is a schematic diagram of real-time tracking of expression features in the embodiment.

Figure 9(a) is a schematic diagram of the expression features of the frontal face as observed by the system at head yaw angle α.

Figure 9(b) is a schematic diagram of the expression features of the side face as observed by the system at head yaw angle α.

Figure 9(c) is a schematic diagram of the expression features of the frontal face as observed by the system at head pitch angle β.

Figure 9(d) is a schematic diagram of the expression features of the upturned face as observed at head pitch angle β.

Figure 10 is a schematic diagram of the system functions of the present invention.

Detailed Description of the Embodiments

The present invention is described in further detail below in conjunction with the accompanying drawings.

In a specific implementation of the present invention, the embodiment shown in Figure 1 proposes a student classroom participation evaluation system based on expression recognition and posture behavior. Its internal sections mainly include a video recording section; a data acquisition section; a head posture, expression, and body behavior recognition section; and an experimental analysis and evaluation section. Specifically: 1) The real-time camera monitoring section controls the laptop or desktop camera during an online class to monitor the student's learning state in front of the screen in real time and transmits the captured video signal to the computer's main control and computing center. A multi-directional posture processing approach is initially adopted to obtain facial expression features as well as body posture and finger features. During intelligent analysis and judgment, recognition can rely on facial expression features alone or be combined with body posture.

2) Image collection section. By adjusting the device's read frequency, i.e., the number of digitizations within a given period, the required audio and video are automatically collected as data material, providing the data basis for subsequent modules.

3) Image processing section. This section performs face detection on the image target, extracts the facial feature region, divides it into 468 key points, and tracks the learner's body behavior, such as arms and fingers, in real time according to the model, providing reference data for the next module.

4) Statistical analysis module. Using the data provided by the previous module, the input model performs a comparison to obtain the student's attention to and participation in the course content; when participation is low at a given moment, the time is recorded and a real-time reminder is issued.

In one embodiment of the present invention, as shown in Figure 2, Figure 3(a), Figure 3(b), Figure 3(c), Figure 4, and Figure 5, the specific working steps of the system are as follows:

(1) Face detection method based on video images

In current online teaching, learners present different postures, expressions, and behaviors, and because each learner attends class in a different environment, the scenes in the monitoring video are relatively complex and variable. Precisely because teaching is online, however, the distance between the learner and the screen stays within a certain range, and the built-in camera can usually capture the target continuously and fairly clearly, so the learner's facial information can be captured effectively.

Given a video image, the system must locate and detect a person's head posture, face position, neck, arms, and the flexion, extension, and posture of the five fingers, in three parts: image segmentation, target locking, and feature extraction. The variability of human faces and the complexity of the environment make locating and tracking facial and body feature points very challenging. At present, model-based localization is the mainstream approach. XGBoost (eXtreme Gradient Boosting) is a GBDT-based algorithm, or rather an engineering implementation of one, whose efficiency, flexibility, and light weight have led to wide use in recommender systems, data mining, and other fields. Building on the XGBoost model and combining it with the mechanism of ResNet, this work implements facial feature extraction; based on the existing XGBoost method together with existing models and data, multi-pose facial features are extracted during online classes.

As shown in Figure 3(a), for a close-up image of an individual student, this method can effectively identify the head rotation angle and the facial feature region; occlusion by glasses, hands, and even masks is handled by an effective noise reduction algorithm, demonstrating the method's stability and robustness. Figure 3(b) shows the detection results for facial feature points on partial side-view images; feature points cannot be accurately identified only in individual cases of extreme face deflection or excessive occlusion. Figure 3(c) shows the detection of some body parts, locking onto hand features to observe their detailed behavior.
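
The patent does not name a landmark library, but the 468-point face mesh it describes matches MediaPipe's Face Mesh model; below is a minimal sketch under that assumption, with MediaPipe Hands standing in for the hand tracking.

```python
# Hypothetical sketch of face and hand landmark extraction with
# MediaPipe, assumed here only because its Face Mesh outputs the same
# 468 facial key points the patent describes.
import cv2
import mediapipe as mp

face_mesh = mp.solutions.face_mesh.FaceMesh(static_image_mode=False,
                                            max_num_faces=1)
hands = mp.solutions.hands.Hands(static_image_mode=False,
                                 max_num_hands=2)

def extract_landmarks(frame_bgr):
    """Return (face landmarks or None, hand landmarks or None) for one frame."""
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    face = face_mesh.process(rgb)   # up to 468 key points per face
    hand = hands.process(rgb)       # 21 key points per detected hand
    face_pts = (face.multi_face_landmarks[0].landmark
                if face.multi_face_landmarks else None)
    return face_pts, hand.multi_hand_landmarks
```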

(2) Method for analyzing and judging facial expressions

Expression is the outward manifestation of emotion and an objective indicator for studying humans. The facial affect scoring technique of the noted American psychologist Paul Ekman can measure six emotions: happiness, surprise, disgust, anger, fear, and sadness (Meng Zhaolan, 1987). These six expressions admittedly do not map directly onto education, but this work extracts their distinctive features so that they can be better applied to the educational field. In a normal class, students' expressions convey a great deal of emotional information to the teacher, reflecting how well they understand the course and expressing emotional characteristics such as comprehension or confusion. A student who understands shows a relaxed and happy facial expression, keeps the head fixed on the teacher, and moves with the teacher's rhythm, indicating strong interest in the current content and high participation. When a student appears bored by or even resistant to the content, with behavioral features such as keeping the head down for long periods, looking around, frowning, or even lying down to sleep, this mostly indicates low participation or an inability to understand what the teacher is saying.

Expression is a natural, physiologically based form of communication that arises from human instinct and is not under conscious control, so it cannot be concealed, let alone faked. Expressions have many applications: in criminal investigation within the judicial field, to judge whether a suspect is lying; in the prevention and treatment of mental illness, to help treat patients; and in education, to help teachers better understand learners' emotional states and compensate for them, so that students learn better and learning efficiency improves.

A histogram of oriented gradients (HOG) over texture features extracts features from dimensionality-reduced images, improving the efficiency of the model's feature extraction and raising performance at the input level; weighted operations combined with linear interpolation fully express image details and overall features. CBAM (an attention module) is added to the network to improve accuracy. The Convolutional Block Attention Module (CBAM), proposed by Sanghyun Woo et al. in 2018, is a simple and effective attention module for feedforward convolutional neural networks. Given an intermediate feature map, the CBAM module sequentially infers attention maps along two separate dimensions, channel and spatial, and multiplies each attention map by the input feature map for adaptive feature refinement. Because different expressions have different characteristics, we innovatively adopt different attention learning mechanisms for different expressions, using a spatial transformer network to focus so that the network concentrates on the important parts of the face. This reduces the number of parameters, mitigates exploding gradients, and increases learning efficiency and accuracy, making the approach better suited to smart education.
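
A condensed PyTorch sketch of the CBAM structure described above, channel attention followed by spatial attention with each map multiplied into the input feature map, is given below; the reduction ratio and kernel size follow common defaults from the original paper and are otherwise illustrative.

```python
# Sketch of CBAM (Woo et al., 2018) as described in the text; layer
# sizes are illustrative, not the patent's exact configuration.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))     # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))      # global max pooling branch
        return torch.sigmoid(avg + mx).view(b, c, 1, 1)

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)      # pool across channels
        mx = x.amax(dim=1, keepdim=True)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CBAM(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        x = x * self.ca(x)      # refine channel-wise first,
        return x * self.sa(x)   # then spatially
```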

Given a face image or face video, not all parts of the face matter for detecting a particular emotion; in many cases we only need to attend to specific regions to perceive the underlying emotion. Based on this observation, and inspired by the idea of "teaching students in accordance with their aptitude", we propose adding an attention mechanism that applies different attention learning algorithms to different expressions, integrated into our framework through a spatial transformer network so as to focus on important facial regions.

In the online teaching environment, we introduce posture recognition and facial expression recognition, together with head and body posture detection, to analyze both the objective state of the expression and the learner's psychological state underlying their behavior. Intelligent computer analysis yields a participation score, an "Alarm" can be raised when participation stays low for an extended period, and teachers are also helped to keep track of the teaching situation.

(3) Feature definition

We divide learners' emotional states into three main categories through expression features: happy, neutral, and bored. Following the Izard facial coding system, these states are judged by combining facial expressions with the head's twist angle and posture, as shown in Figure 5.

Further, in one embodiment of the present invention, the learner's head posture and facial expression behavior are combined, and the high-performance computing power of the computer is used to track online education effectively, so that online learning efficiency improves and teachers understand students' learning conditions, enabling supervision and remedial teaching for weak links. By analyzing students' facial expressions and head posture angles, the system records each student's state, helps students adjust their learning styles and methods in time, and formulates targeted plans, enabling the continuous development of students' abilities and individuality. At the same time, if students' classroom participation can be grasped accurately during teaching and communication, management efficiency can be improved effectively: teachers can understand students' true psychological condition, discern whether students' everyday behavior involves deception, adjust management thinking and working methods in time, build trust with students, and prepare adequately for possible emergencies.

The operating flow of the system specifically includes:

(1) Expression feature recognition

The deeper a network's structure, the stronger its fitting capacity should be; however, once depth reaches a certain point, network performance drops rather than rises. A conventional neural network simply connects convolutional layers, fully connected layers, and other structures in a fixed order, with each layer accepting information only from the previous layer and passing it on after processing. As the network deepens, this single connection pattern degrades performance. Introducing residual learning remedies this deficiency: so-called residual learning adds "shortcut connections" on top of the single connection pattern. The most important element of ResNet is the residual learning unit; a residual network adds skip connections to the network.

Meanwhile, because all weights are updated simultaneously in every backpropagation pass, the layers "lack tacit coordination", and the more layers there are, the harder this coordination becomes, so the distribution of each layer's inputs must also be controlled. We apply batch normalization, which has a certain regularizing effect; once it is used, dropout is no longer needed, and overfitting is also reduced.

As the basic classification underlying intelligent analysis, the judgment of facial expressions is especially important. After experimental comparison across many models, including ResNet, GoogLeNet, AlexNet, and VGG, we selected ResNet as the best-performing model and pushed the experimental accuracy to its highest: our accuracy on the training set reached 97.59%, as shown in Figure 6. Let the output function be H(x), let F(x) be the residual function, and let x be the input; then:

H(x) = F(x) + x
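
A minimal PyTorch residual unit realizing H(x) = F(x) + x, with the batch normalization discussed above, looks as follows; the channel count and layer layout are illustrative, not the patent's exact network.

```python
# Sketch of a residual learning unit: the shortcut adds the input x to
# the residual branch F(x), so the block outputs H(x) = F(x) + x.
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.f = nn.Sequential(                        # F(x): residual branch
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),                  # batch normalization
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels))
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.f(x) + x)                # H(x) = F(x) + x
```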

By training the model with deep learning, judgments about facial expressions can be obtained, as shown in Figures 7(a) and 7(b), and displayed in real time; the real-time tracking of expression features is shown in Figure 8.

(2) Head posture judgment

Two angles, α and β, along two directions are introduced to track the state of the head posture. The initial participation coefficients are set to α(t) = -1 and β(t) = -1, where t is the continuously varying time. When the yaw angle over time satisfies |α(t)| ≤ 30° and the pitch angle over time satisfies |β(t)| ≤ 15°, the attention coefficients become α(t) = 1 and β(t) = 1; otherwise the attention coefficient is 0. The judgment of online classroom participation is given by formula (1):

[Formula (1) appears here as an image in the original; it computes the participation judgment from the attention coefficients α(t) and β(t) defined above.]

When the student is in a negative emotional state, when face detection fails, or when the angular deflection exceeds thirty degrees, the student can essentially be classed as not participating in the class; when this persists beyond a standard length of time, the system issues a reminder. The classroom participation time T is calculated by formula (2):

[Formula (2) appears here as an image in the original; from the definitions below, it computes T = (n / N) · T_o.]

Here N is the total number of times the learner is checked during the online class, n is the number of those checks reported as active participation, and T_o is the total class duration. From this formula the class participation time can be obtained, which indirectly reflects the student's interest in the class and learning efficiency and helps the teacher coordinate and plan.
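
A sketch of this participation logic in Python follows, assuming per-check head-pose angles in degrees; the 30° and 15° thresholds follow the text, while the names and data layout are illustrative assumptions.

```python
# Hypothetical sketch of formulas (1) and (2): an attention flag per
# detection, then participation time T = (n / N) * T_o.
def attention_flag(alpha_deg, beta_deg, face_detected=True):
    """1 if this check counts as active participation, else 0."""
    if not face_detected:
        return 0                    # failed detection: not participating
    return 1 if abs(alpha_deg) <= 30 and abs(beta_deg) <= 15 else 0

def participation_time(checks, total_class_time_s):
    """Formula (2): T = (n / N) * T_o, with n active checks out of N."""
    N = len(checks)
    n = sum(attention_flag(a, b, ok) for a, b, ok in checks)
    return (n / N) * total_class_time_s if N else 0.0

# Example: 2 of 4 checks were attentive in a 45-minute class
T = participation_time([(5, 3, True), (40, 2, True),
                        (10, 5, True), (8, 4, False)], 45 * 60)  # 1350.0 s
```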

To observe the expression features of the face effectively, the head yaw angle α is defined relative to the facial feature points: as shown in Figures 9(a) and 9(b), the angle between the frontal face and the twisted face is the head yaw angle α. As shown in Figures 9(c) and 9(d), the angle between the frontal face and the upward gaze is the head pitch angle β. Parameters such as the head yaw angle α and the head pitch angle β are analyzed and judged in detail.

The foregoing shows and describes the basic principles, main features, and advantages of the present invention. Those skilled in the art should understand that the present invention is not limited by the above embodiments, which, together with the description, merely illustrate its principles; various changes and improvements may be made without departing from the spirit and scope of the invention, and all such changes and improvements fall within the scope of the claimed invention. The scope of protection of the present invention is defined by the appended claims and their equivalents.

Claims (6)

1. An emotion-recognition-based system for evaluating learners' online learning efficiency, comprising a real-time camera monitoring module, an image collection module, an image processing module, and a statistical analysis module within the system architecture, characterized in that: the real-time camera monitoring module acquires key data; the image processing module analyzes and processes the data; and the statistical analysis module performs statistical analysis on the constructed model data;
the working steps of the emotion-recognition-based system for evaluating learners' online learning efficiency are as follows:
1) Face detection based on video images: given a video image, a person's head posture, face position, neck, arms, and the flexion-extension postures of the five fingers are located and detected in three parts, namely image segmentation, target locking, and feature extraction;
2) Analysis and judgment of facial expressions: features are extracted from dimensionality-reduced images based on a histogram of oriented gradients (HOG) over texture features, improving the efficiency of the model's feature extraction and raising performance at the input level; weighted operations combined with linear interpolation fully express image details and overall features, and a CBAM is added to the network;
3) Feature definition: the learner's emotional state is divided into three categories through expression features, namely "happy", "neutral", and "bored", and the three states are judged by combining facial expression with the twist angle and the head posture.
2. The system of claim 1, characterized in that: the real-time camera monitoring module controls the laptop or desktop camera during an online class to monitor the student's learning state in front of the screen in real time and transmits the captured video signal to the computer's main control and computing center in real time.
3. The system of claim 2, characterized in that: the real-time camera monitoring module comprises a face detection function, an expression recognition function, a head posture analysis function, and a body behavior judgment function.
4. The system of claim 1, characterized in that: the image collection module automatically collects the required audio and video as data material by adjusting the device's read frequency, i.e., the number of digitizations within a given period, and provides the data basis for the operation of subsequent modules.
5. The system of claim 1, characterized in that: the image processing module performs face detection on an image target, extracts the facial feature region, divides it into 468 key points, tracks the learner's body behavior, such as arms and fingers, in real time according to the model, and provides reference data for the next module.
6. The system of claim 1, characterized in that: the statistical analysis module feeds the data provided by the previous module into the model for comparison to obtain the students' attention to and participation in the course content, and records the time and issues a real-time reminder when participation at a given moment is low.
CN202211456127.5A 2022-11-21 2022-11-21 System for evaluating online learning efficiency of learner based on emotion recognition Pending CN115966003A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211456127.5A CN115966003A (en) 2022-11-21 2022-11-21 System for evaluating online learning efficiency of learner based on emotion recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211456127.5A CN115966003A (en) 2022-11-21 2022-11-21 System for evaluating online learning efficiency of learner based on emotion recognition

Publications (1)

Publication Number Publication Date
CN115966003A 2023-04-14

Family

Family ID: 87356941

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211456127.5A Pending CN115966003A (en) 2022-11-21 2022-11-21 System for evaluating online learning efficiency of learner based on emotion recognition

Country Status (1)

Country Link
CN (1) CN115966003A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117036877A (en) * 2023-07-18 2023-11-10 六合熙诚(北京)信息科技有限公司 Emotion recognition method and system for facial expression and gesture fusion
CN117036877B (en) * 2023-07-18 2024-08-23 六合熙诚(北京)信息科技有限公司 Emotion recognition method and system for facial expression and gesture fusion

Similar Documents

Publication Publication Date Title
Shen et al. Assessing learning engagement based on facial expression recognition in MOOC’s scenario
Murshed et al. Engagement detection in e-learning environments using convolutional neural networks
CN110507335A (en) Method and system for assessing the mental health status of prisoners based on multi-modal information
CN112766173A (en) Multi-mode emotion analysis method and system based on AI deep learning
CN110659674A (en) Lie detection method based on sight tracking
Li et al. Research on learner's emotion recognition for intelligent education system
Villegas-Ch et al. Identification of emotions from facial gestures in a teaching environment with the use of machine learning techniques
CN115966003A (en) System for evaluating online learning efficiency of learner based on emotion recognition
Sidhu et al. Deep learning based emotion detection in an online class
Chen et al. MDNN: Predicting Student Engagement via Gaze Direction and Facial Expression in Collaborative Learning.
Maddu et al. Online learners’ engagement detection via facial emotion recognition in online learning context using hybrid classification model
CN116894978B (en) An online exam anti-cheating system that integrates facial emotions and behavioral features
Li et al. Image processing-based detection method of learning behavior status of online classroom students
Men et al. Detecting the confusion of students in massive open online courses using eeg
CN115273176A (en) Pain multi-algorithm objective assessment method based on vital signs and expressions
CN115601823A (en) Method for tracking and evaluating concentration degree of primary and secondary school students
Tan et al. Towards automatic engagement recognition of autistic children in a machine learning approach
Ouyang et al. DAEEGViT: A domain adaptive vision transformer framework for EEG cognitive state identification
CN119226856A (en) Interactive learning optimization method and system based on brain-eye fusion and transfer learning
CN117036877B (en) Emotion recognition method and system for facial expression and gesture fusion
CN118844952B (en) A non-contact cognitive impairment identification system and method based on facial video
Veena et al. Mental Health Monitoring System Using Facial Recognition, PEN Test and IQ Test
Shen et al. Dual‐task enhanced global–local temporal–spatial network for depression recognition from facial videos
Sheng et al. The model of e-learning based on affective computing
Long Design of the sport training assistant system based on local spatio-temporal features

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination