CN111178242A - Student facial expression recognition method and system for online education - Google Patents


Info

Publication number
CN111178242A
CN111178242A (application CN201911377459.2A)
Authority
CN
China
Prior art keywords
student
facial expression
task
face
model
Prior art date
Legal status
Pending
Application number
CN201911377459.2A
Other languages
Chinese (zh)
Inventor
王鑫琛
张鹏
王添翼
姚璐
郑伟华
黄浩
Current Assignee
Shanghai Palm Education Technology Co Ltd
Original Assignee
Shanghai Palm Education Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Palm Education Technology Co Ltd filed Critical Shanghai Palm Education Technology Co Ltd
Priority to CN201911377459.2A
Publication of CN111178242A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174: Facial expression recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9535: Search customisation based on user profiles and personalisation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10: Services
    • G06Q50/20: Education
    • G06Q50/205: Education administration or guidance
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161: Detection; Localisation; Normalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Multimedia (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Human Computer Interaction (AREA)
  • Tourism & Hospitality (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a student facial expression recognition method for online education, which comprises the following steps: S1, performing face detection and face alignment calibration on manually labeled student facial expression images, and constructing a student facial expression image data set; S2, retraining a pre-trained model with the student facial expression image data set to obtain a student expression recognition model; S3, acquiring facial expression images of students and predicting student state data based on the student expression recognition model. The method can be implemented on the system architecture of online education, with the facial emotion analysis result obtained from the student client pushed to the teacher client. By adopting computer vision and big data technology, the method can accurately judge students' facial expressions and effectively analyze their mental activities and mental states, which helps the teacher grasp students' online learning states in real time, adjust the class accordingly, and improve teaching quality and student engagement in class; the method is therefore suitable for popularization.

Description

Student facial expression recognition method and system for online education
Technical Field
The invention relates to the technical field of facial expression recognition, in particular to a student facial expression recognition method and system for online education.
Background
In the field of online education, judging the mental activities and mental states of students in class from image information acquired by a camera can provide an accurate basis for evaluating teaching quality and improving the teaching experience in a targeted way. Facial expressions carry a large amount of information, and with the rapid development of computer vision technology, facial expression recognition has become a current research hotspot. However, many difficulties must be overcome to build a real-time facial expression recognition system for online education, and how to effectively extract facial expression features to improve recognition efficiency is one of the key research problems.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a student facial expression recognition method and system for online education.
In order to achieve the purpose, the technical scheme of the invention is as follows:
a student facial expression recognition method for online education comprises
S1, adopting an artificially labeled student face expression image to perform face detection and face alignment calibration processing, and constructing a student face expression image data set;
s2, retraining a pre-training model by using the student facial expression image data set to obtain a student expression recognition model;
and S3, acquiring facial expression image prediction student state data of the student based on the student expression recognition model.
Further, in the above student facial expression recognition method for online education, step S1 of performing face detection and face alignment calibration on manually labeled student facial expression images and constructing a student facial expression image data set comprises:
S11, constructing a student expression image data set:
using manually labeled student avatar screenshots as data set samples, wherein the labels comprise a plurality of expression categories;
S12, image preprocessing:
preprocessing the data set sample images;
S13, face detection:
detecting the face position in each preprocessed image;
S14, face key point labeling:
labeling face key points at each detected face position to obtain face key point labeling information;
S15, face alignment calibration:
aligning and calibrating the face image according to the face key point labeling information.
Further, in the above student facial expression recognition method for online education, step S2 of retraining the pre-trained model with the student facial expression image data set to obtain a student expression recognition model comprises:
S21, training the student expression recognition model:
the parameters of a deep residual network trained on a large-scale face image data set are adopted as the pre-trained model for facial expression recognition, wherein only the convolutional layer parameters of the pre-trained model are used as initialization parameters for training the student expression recognition model, and the number of neurons in the fully connected layer of the pre-trained model is set to 512; training then proceeds from the beginning on the student facial expression image data set, the final output layer of the pre-trained model is a Softmax layer, and a Softmax regression model is finally adopted as the classifier for student expression recognition, so that the student expression recognition model is obtained through training.
Further, in the above student facial expression recognition method for online education, step S2 of retraining the pre-trained model with the student facial expression image data set to obtain a student expression recognition model further comprises:
S22, judging whether the specified number of iterations has been reached:
the specified number of iterations is 40 epochs; while training has not reached the set number of iterations, images continue to be input to adjust the model's weight parameters, and training stops once the set number of iterations is reached;
S23, selecting the optimal model weight parameters:
after the model stops training, the optimal weight parameters are selected according to the recognition accuracy computed for each epoch, and these weight parameters are saved to a file.
Further, in the above student facial expression recognition method for online education, step S3 of acquiring facial expression images of students and predicting student state data based on the student expression recognition model comprises:
S31, acquiring a video screenshot containing the student's expression;
S32, processing the acquired video screenshot to obtain a face-aligned and calibrated image;
S33, predicting the facial expression in the face-aligned and calibrated image;
S34, calculating a prediction score; and
S35, writing the calculation result into a database.
In a second aspect, the invention further provides a device implementing the above student facial expression recognition method for online education, comprising a student expression recognition model learning module and a prediction module, wherein
the learning module executes the steps of:
S1, performing face detection and face alignment calibration on manually labeled student facial expression images, and constructing a student facial expression image data set;
S2, retraining a pre-trained model with the student facial expression image data set to obtain a student expression recognition model; and
the prediction module executes the step of:
S3, acquiring facial expression images of students and predicting student state data based on the student expression recognition model.
In a third aspect, the invention further provides a system including the above device. The system further includes a student facial expression recognition task message queue management module, a student facial expression recognition task scheduling module, and a business integration module for student facial expression data, wherein:
the student facial expression recognition task message queue management module is used to acquire student facial expression recognition request messages and video screenshots from the student client;
the student facial expression recognition task scheduling module is used to receive the recognition requests and the student-side video screenshots, schedule the tasks, and call the prediction module to recognize the category of the student's facial expression; and
the business integration module for student facial expression data is used to create facial expression recognition result data query requests, with the query results fed back in real time to the client of the teacher of the current class.
Further, in the above system, the student facial expression recognition task message queue management module executes the following steps:
first, the student client captures student avatar pictures at regular intervals, uploads the pictures to an Alibaba Cloud object storage server through an HTTP request, and generates a download address link;
the generated picture download link and the classroom-related information are written into a relational database; and
the student facial expression recognition request message is published in the producer mode of the Alibaba Cloud RocketMQ message queue system.
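The three producer-side steps above can be sketched in Python, with the standard-library `queue` module standing in for the Alibaba Cloud RocketMQ topic and a list standing in for the relational database (all field names and values below are illustrative assumptions, not taken from the patent):

```python
import json
import queue
import time

message_queue = queue.Queue()   # stand-in for the RocketMQ topic (producer mode)
relational_db = []              # stand-in for the relational database

def publish_recognition_request(student_id: str, classroom_id: str, oss_url: str) -> None:
    """Record the screenshot's download link plus classroom info,
    then publish a facial expression recognition request message."""
    record = {"student_id": student_id, "classroom_id": classroom_id,
              "screenshot_url": oss_url, "ts": time.time()}
    relational_db.append(record)           # step 2: write link + classroom info to the DB
    message_queue.put(json.dumps(record))  # step 3: publish the request message

# Illustrative call: one avatar screenshot already uploaded to object storage.
publish_recognition_request("stu-001", "class-42", "https://example-bucket/shot1.jpg")
msg = json.loads(message_queue.get())
print(msg["classroom_id"])  # class-42
```

A real implementation would replace `message_queue.put` with the RocketMQ producer client's send call and `relational_db.append` with an SQL insert.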
Further, in the above system, the student facial expression recognition task scheduling module executes the following steps:
consuming the classroom's student facial expression recognition request messages in the consumer mode of the RocketMQ message queue system;
distributing the received task request messages to the task execution units, where each task execution unit processes a task as follows:
initializing each task in the task set in the order in which the tasks entered the message queue, determining the current task queue, and selecting the first task in the queue as the task to be processed;
calling the prediction module of the student expression recognition model to recognize the category of the student's facial expression, executing the current task at the processor's full speed, and writing the final execution result of the task into a relational database;
judging whether the current task has finished executing or whether its execution time has expired; if neither, continuing to wait for the current task to finish; otherwise, releasing the current task; then judging whether the end of the task queue has been reached; if not, removing the task from the task queue and selecting the first unexecuted task in the queue to continue; otherwise, the task scheduling work of this task execution unit is finished.
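The per-unit scheduling loop above can be sketched in pure Python (the deadline value, the task shape, and the `execute` callback are illustrative assumptions; in the real system `execute` would invoke the expression-prediction module and its result would be written to the relational database):

```python
import time
from collections import deque

def run_task_unit(tasks, execute, deadline_s=5.0):
    """Process tasks in queue-entry order: run the head task, wait until it
    finishes or its execution time expires, then release it and move on."""
    q = deque(tasks)          # tasks in the order they entered the message queue
    results = []
    while q:
        task = q[0]           # first task of the queue is the current task
        start = time.monotonic()
        done = execute(task)  # e.g. call the expression-prediction module
        while not done and time.monotonic() - start < deadline_s:
            time.sleep(0.01)  # keep waiting for the current task
            done = execute(task)
        results.append((task, done))  # done == False means the deadline expired
        q.popleft()           # release the task, take the next unexecuted one
    return results

# Toy run: every task "finishes" immediately.
out = run_task_unit(["shot1.jpg", "shot2.jpg"], execute=lambda t: True)
print(out)  # [('shot1.jpg', True), ('shot2.jpg', True)]
```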
Further, in the above system, the business integration module for student facial expression data executes the following steps:
creating a facial expression recognition result data query request through a background service interface;
feeding the query result back in real time to the client of the teacher of the current class; and
updating the query result to a classroom quality supervision and management platform for producing classroom summary reports or generating statistical data reports.
The invention has the following beneficial effects:
the method can be implemented on the system architecture of online education, with the facial emotion analysis result obtained from the student client pushed to the teacher client; by adopting computer vision and big data technology, it can accurately judge students' facial expressions and effectively analyze their mental activities and mental states. Based on a big data platform, the system achieves real-time, robust student facial expression recognition: a distributed asynchronous computing system processes classroom screenshots in real time and returns the recognition results to the teacher client for judging students' current expressions and states, which helps the teacher grasp students' online learning states in real time, adjust the class accordingly, and improve teaching quality and student engagement in class; the method is therefore suitable for popularization.
Drawings
In order to more clearly illustrate the detailed description of the invention or the technical solutions in the prior art, the drawings that are needed in the detailed description of the invention or the prior art will be briefly described below. Throughout the drawings, like elements or portions are generally identified by like reference numerals. In the drawings, elements or portions are not necessarily drawn to scale.
FIG. 1 is a flowchart of an embodiment of a method for recognizing facial expressions of students facing online education according to the present invention;
fig. 2 is a system architecture diagram of an embodiment of the student facial expression recognition system for online education according to the invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and therefore are only examples, and the protection scope of the present invention is not limited thereby.
It is to be noted that, unless otherwise specified, technical or scientific terms used herein shall have the ordinary meaning as understood by those skilled in the art to which the invention pertains.
Example 1
As shown in FIG. 1, the student facial expression recognition method for online education comprises:
S1, performing face detection and face alignment calibration on manually labeled student facial expression images, and constructing a student facial expression image data set;
S2, retraining a pre-trained model with the student facial expression image data set to obtain a student expression recognition model; and
S3, acquiring facial expression images of students and predicting student state data based on the student expression recognition model.
The invention adopts computer vision and big data technology to realize a real-time, robust student facial expression recognition method and to improve facial expression recognition efficiency in the field of online education; when applied to online education, it can effectively analyze the emotions and mental states of students.
Specifically, step S1 of performing face detection and face alignment calibration on manually labeled student facial expression images and constructing a student facial expression image data set comprises:
S11, constructing a student expression image data set:
a sufficient number of student avatar screenshots are acquired, and the manually labeled screenshots serve as the data set for model learning. The labels are divided into six categories: happy, sad, angry, disgusted, afraid, and calm, with each category defined as follows:
(1) Happy: the student's expression features include mouth corners pulled up and back, etc.; the learning emotion appears as interest, understanding, etc.;
(2) Sad: the expression features include pulled-down mouth corners, raised upper eyelids, etc.; the learning emotion appears as boredom, tiredness, etc.;
(3) Angry: the expression features include a closed or open mouth, flared nostrils, widened eyes, etc.; the learning emotion appears as doubt, weariness, etc.;
(4) Disgusted: the expression features include a closed mouth or pulled-down mouth corners, a wrinkled nose, creased lower eyelids, etc.; the learning emotion appears as lack of interest and weariness;
(5) Afraid: the expression features include a flushed face, avoiding the camera, and wide or possibly squinted eyes; the learning emotion appears as fear, tension, etc.;
(6) Calm: the expression features are natural, with no obvious change in the facial features; the learning emotion appears as indifference, thinking, etc.
S12, image preprocessing:
the size of each acquired image is first normalized; in this embodiment, all acquired images are uniformly converted to 320 × 240 pixels. This standardization prevents the subsequent affine transformation from being adversely affected and improves the efficiency of subsequent model training.
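The size normalization of S12 can be sketched with a minimal nearest-neighbor resize in NumPy (the patent does not specify an interpolation method; a production system would more likely use a library resampler such as OpenCV's `cv2.resize`):

```python
import numpy as np

def normalize_size(img: np.ndarray, width: int = 320, height: int = 240) -> np.ndarray:
    """Resize an H x W (or H x W x C) image to height x width by nearest-neighbor sampling."""
    h, w = img.shape[:2]
    # Map each output pixel back to its nearest source pixel.
    rows = (np.arange(height) * h / height).astype(int)
    cols = (np.arange(width) * w / width).astype(int)
    return img[rows[:, None], cols]

# A dummy grayscale "screenshot" of arbitrary size is normalized to 320 x 240.
frame = np.random.randint(0, 256, size=(480, 640), dtype=np.uint8)
normalized = normalize_size(frame)
print(normalized.shape)  # (240, 320)
```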
S13, face detection:
a Histogram of Oriented Gradients (HOG, a feature descriptor used for object detection in computer vision and image processing) combined with a Support Vector Machine (SVM) classifier serves as the face detection model of this step, detecting the face position in the preprocessed image.
When HOG + SVM is used for face detection, HOG features are collected by analyzing the image: the distribution of gradient directions and edge densities describes the appearance and shape of a local target well, so the image can be characterized by histogram information computed over its pixels, and the collected HOG feature vectors are then used for SVM classification. The SVM is a binary classifier; an SVM classifier is trained to obtain the face detection model, and face detection is performed on the preprocessed image to obtain the face region in the image.
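The gradient-orientation histogram at the core of HOG can be illustrated in NumPy as follows (real HOG implementations, e.g. in dlib or scikit-image, additionally compute histograms per cell and normalize over blocks, which is omitted in this sketch):

```python
import numpy as np

def orientation_histogram(patch: np.ndarray, bins: int = 9) -> np.ndarray:
    """Histogram of gradient orientations for one image patch, weighted by
    gradient magnitude -- the core ingredient of a HOG descriptor."""
    gy, gx = np.gradient(patch.astype(float))
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees, as in standard HOG.
    orientation = np.degrees(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(orientation, bins=bins, range=(0, 180), weights=magnitude)
    # L2-normalize so the descriptor is invariant to overall contrast.
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

patch = np.zeros((16, 16))
patch[:, 8:] = 255.0  # a vertical edge produces a horizontal (0-degree) gradient
h = orientation_histogram(patch)
print(h.argmax())     # 0: all gradient energy falls in the first orientation bin
```

In the full pipeline such histograms, concatenated over a grid of cells, form the feature vector fed to the SVM.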
S14, face key point labeling:
after the face region is cropped and its size normalized, face key points are labeled using a cascade of regression trees to obtain the face key point labeling information; five face key points are predicted: the left eye, the right eye, the nose tip, and the two mouth corners.
S15, face alignment calibration:
the face pose is aligned and calibrated according to the five face key points labeled in the previous step: an affine transformation (translation, scaling, rotation, etc.) is applied to the face image region for alignment and calibration, and the aligned and calibrated face region is cropped and normalized to 128 × 128 pixels, serving as the input image of the facial expression recognition model.
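The alignment can be sketched by estimating a similarity transform (translation, uniform scale, and rotation, i.e. a restricted affine transform) that maps the five detected key points onto canonical template positions by least squares; the template coordinates below are illustrative assumptions, not values from the patent:

```python
import numpy as np

# Illustrative canonical key-point template in a 128 x 128 output image
# (left eye, right eye, nose tip, left mouth corner, right mouth corner).
TEMPLATE = np.array([[42, 52], [86, 52], [64, 76], [48, 98], [80, 98]], float)

def similarity_transform(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Least-squares similarity transform [[a, -b, tx], [b, a, ty]] mapping src -> dst."""
    # Each point pair contributes two linear equations in the unknowns (a, b, tx, ty).
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, -y, 1, 0]); rhs.append(u)
        rows.append([y,  x, 0, 1]); rhs.append(v)
    (a, b, tx, ty), *_ = np.linalg.lstsq(np.array(rows, float), np.array(rhs, float), rcond=None)
    return np.array([[a, -b, tx], [b, a, ty]])

# Detected key points: the template shifted by (10, 20); the estimated
# transform recovers the inverse shift exactly.
detected = TEMPLATE + np.array([10.0, 20.0])
M = similarity_transform(detected, TEMPLATE)
print(np.round(M, 3))
```

In practice the resulting 2 × 3 matrix would then be passed to an image-warping routine (e.g. `cv2.warpAffine`) to produce the aligned 128 × 128 crop.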
In step S2, retraining the pre-trained model with the student facial expression image data set to obtain a student expression recognition model includes:
S21, training the student expression recognition model:
the algorithm model selected in this embodiment is a deep residual network (ResNet-34). The parameters of a deep residual network trained on a large-scale face image data set serve as the pre-trained model for facial expression recognition, wherein only the convolutional layer parameters of the pre-trained model are used as initialization parameters of the model of the invention, the number of neurons in the fully connected layer is set to 512, training proceeds from the beginning on the student facial expression image data set constructed in this embodiment, and the final output layer of the pre-trained model is a Softmax layer, so a Softmax regression model finally serves as the classifier for student expression recognition. A large number of student facial expression images are fed into the deep residual network for training and learning; the Softmax regression model computes a score for each class, converts the scores into probabilities through the softmax function, and determines the class of the input image from the final probability values. The student expression recognition model is obtained through training in this way.
During retraining of the pre-trained model, the method further comprises step S22, judging whether the specified number of iterations has been reached:
in this embodiment, the specified number of iterations is 40 epochs, where 1 epoch means that all training data (i.e., the student facial expression image data set samples) have been fed into the model and one round of weight adjustment has been completed; while training has not reached the set number of iterations, images continue to be input to adjust the model's weight parameters, and training stops once the set number of iterations is reached.
S23, selecting the optimal model weight parameters:
after the model stops training, the optimal weight parameters are selected according to the recognition accuracy computed for each epoch, and these weight parameters are saved to a file.
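The stopping rule of S22 and the best-weight selection of S23 amount to a loop of the following shape (pure-Python sketch; `train_one_epoch` and `evaluate` are hypothetical placeholders for the actual training and accuracy-evaluation routines):

```python
import copy

def train_with_best_checkpoint(model_weights, train_one_epoch, evaluate, epochs=40):
    """Run the specified number of epochs (40 in this embodiment) and keep
    the weights of the epoch with the highest recognition accuracy."""
    best_acc, best_weights = -1.0, None
    for _ in range(epochs):
        model_weights = train_one_epoch(model_weights)  # adjust weights on all samples
        acc = evaluate(model_weights)                   # recognition accuracy this epoch
        if acc > best_acc:
            best_acc, best_weights = acc, copy.deepcopy(model_weights)
    return best_acc, best_weights  # best_weights would then be saved to a file

# Toy stand-ins: "weights" is an epoch counter; accuracy peaks at epoch 25.
accs = [min(e, 50 - e) / 40 for e in range(1, 41)]
best_acc, best_w = train_with_best_checkpoint(
    0, lambda w: w + 1, lambda w: accs[w - 1], epochs=40)
print(best_acc, best_w)  # 0.625 25
```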
The trained student expression recognition model can then be used for recognizing the facial expressions of online education students: facial expression screenshots of students are acquired, and the students' learning states are predicted and judged.
Step S3, acquiring facial expression images of students and predicting student state data based on the student expression recognition model, comprises the following steps:
S31, acquiring a video screenshot containing the student's expression:
in online education, students mostly attend class through a client, so the video screenshots come from the student client; each screenshot can be obtained through a download link of the cloud storage service, and the image recognition task is performed after the picture has been downloaded locally.
S32, processing the acquired video screenshot to obtain a face-aligned and calibrated image, which comprises the following steps:
S321, image preprocessing:
the acquired images are first size-normalized, with all acquired images uniformly converted to 320 × 240 pixels;
S322, face detection:
the face position in the preprocessed image is detected by combining the image's histogram of oriented gradients features with a support vector machine classifier;
S323, face key point labeling:
after the face region is cropped and its size normalized, face key points are labeled using a cascade of regression trees to obtain the face key point labeling information;
S324, face alignment calibration:
the face pose is aligned and calibrated according to the five face key points labeled in the previous step: an affine transformation (translation, scaling, rotation, etc.) is applied to the face image region for alignment and calibration, and the aligned and calibrated face region is cropped and normalized to 128 × 128 pixels.
S33, predicting the facial expression:
the facial expression in the acquired face-aligned and calibrated image is predicted with the student expression recognition model; that is, this step uses the same deep residual network as the learning phase of step S2, but the model's weight parameters no longer need to be adjusted through training: the optimal weight parameters output above are loaded into the model to predict the facial expression category of the input image (the image processed in S324).
S34, calculating the prediction score:
the Softmax regression model of the above steps converts the neural network's output into a probability distribution over the expression categories; the expression category of the avatar is determined by the category with the highest predicted probability, and that probability value is taken as the prediction score, which reflects the facial emotion analysis result.
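The score calculation reduces to a softmax over the network's per-class scores, with the highest probability taken as the prediction score (the category order below is an illustrative assumption):

```python
import numpy as np

CATEGORIES = ["happy", "sad", "angry", "disgusted", "afraid", "calm"]

def predict_with_score(class_scores: np.ndarray):
    """Convert raw class scores to a probability distribution via softmax and
    return (predicted category, prediction score = highest probability)."""
    z = class_scores - class_scores.max()  # stabilize the exponentials
    probs = np.exp(z) / np.exp(z).sum()
    idx = int(probs.argmax())
    return CATEGORIES[idx], float(probs[idx])

label, score = predict_with_score(np.array([2.0, 0.1, 0.0, -1.0, 0.5, 1.0]))
print(label, round(score, 3))  # happy 0.519
```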
S35, writing the processing result into a database:
finally, the recognized analysis result is written to a database through a back-end interface, providing a basis for client feedback and samples for data analysis.
The method can be implemented on the system architecture of online education, with the facial emotion analysis result obtained from the student client pushed to the teacher client. By adopting computer vision and big data technology, the method can accurately judge students' facial expressions, effectively analyze their mental activities and mental states, and help teachers improve teaching quality and student engagement in class; it therefore has broad application prospects.
Example 2
The invention also provides a device implementing the method of Example 1, comprising a student facial expression recognition model learning module and a prediction module, wherein
the student facial expression recognition model learning module executes the steps of:
S1, performing face detection and face alignment calibration on manually labeled student facial expression images, and constructing a student facial expression image data set;
S2, retraining a pre-trained model with the student facial expression image data set to obtain a student expression recognition model; and
the prediction module executes the step of:
S3, acquiring facial expression images of students and predicting student state data based on the student expression recognition model.
Step S1 of performing face detection and face alignment calibration on manually labeled student facial expression images and constructing a student facial expression image data set comprises:
S11, constructing a student expression image data set:
a sufficient number of student avatar screenshots are acquired, and the manually labeled screenshots serve as the data set for model learning. The labels are divided into six categories: happy, sad, angry, disgusted, afraid, and calm, with each category defined as follows:
(1) Happy: the student's expression features include mouth corners pulled up and back, etc.; the learning emotion appears as interest, understanding, etc.;
(2) Sad: the expression features include pulled-down mouth corners, raised upper eyelids, etc.; the learning emotion appears as boredom, tiredness, etc.;
(3) Angry: the expression features include a closed or open mouth, flared nostrils, widened eyes, etc.; the learning emotion appears as doubt, weariness, etc.;
(4) Disgusted: the expression features include a closed mouth or pulled-down mouth corners, a wrinkled nose, creased lower eyelids, etc.; the learning emotion appears as lack of interest and weariness;
(5) Afraid: the expression features include a flushed face, avoiding the camera, and wide or possibly squinted eyes; the learning emotion appears as fear, tension, etc.;
(6) Calm: the expression features are natural, with no obvious change in the facial features; the learning emotion appears as indifference, thinking, etc.
S12, image preprocessing:
the acquired images are first size-normalized: all of them are uniformly resized to 320 × 240 pixels.
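The size normalization step amounts to a resize to a fixed resolution. In practice a library resize (e.g. OpenCV or Pillow) would be used; the following is a dependency-free nearest-neighbor sketch of the same operation:

```python
import numpy as np

def normalize_size(img: np.ndarray, out_h: int = 240, out_w: int = 320) -> np.ndarray:
    """Resize an H x W (x C) image array to out_h x out_w (i.e. 320 x 240
    pixels) with nearest-neighbor sampling. A library resize would normally
    be used; this only illustrates the normalization step."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source column for each output column
    return img[rows][:, cols]
```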
S13, face detection:
the face position in the preprocessed image is detected by combining the histogram of oriented gradients (HOG) feature of the image with a support vector machine (SVM) classifier model.
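The HOG feature named here can be sketched in a few lines: image gradients are computed, their orientations are binned into a histogram per cell, and the concatenated histograms are fed to the SVM. The sketch below computes only the per-cell histograms; block normalization and the SVM classifier are omitted, and the cell size and bin count are conventional assumptions:

```python
import numpy as np

def hog_cell_histograms(gray: np.ndarray, cell: int = 8, bins: int = 9) -> np.ndarray:
    """Compute per-cell histograms of oriented gradients for a grayscale
    image: the descriptor a HOG + SVM face detector is built on. Block
    normalization and the SVM step are intentionally omitted."""
    gray = gray.astype(np.float64)
    gy, gx = np.gradient(gray)                        # row and column gradients
    mag = np.hypot(gx, gy)                            # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0      # unsigned orientation
    h, w = gray.shape
    ch, cw = h // cell, w // cell
    bin_idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)
    hist = np.zeros((ch, cw, bins))
    for i in range(ch):
        for j in range(cw):
            sl = np.s_[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            hist[i, j] = np.bincount(bin_idx[sl].ravel(),
                                     weights=mag[sl].ravel(),
                                     minlength=bins)
    return hist
```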
S14, labeling key points of the human face:
after the face region is cropped and size-normalized, face key points are annotated using a cascade of regression trees to obtain face key point annotation information. Five face key points are predicted: the left eye, the right eye, the nose tip, and the two mouth corners.
S15, face alignment calibration:
the face pose is aligned and calibrated according to the five face key points annotated in the previous step: an affine transformation applies translation, scaling, rotation, and similar operations to the face image region, and the aligned face region is then cropped and normalized to 128 × 128 pixels.
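The affine alignment described above can be sketched by building a similarity transform (rotation + scale + translation) from two of the five key points, the eye centers, so that they land on canonical positions in the 128 × 128 output. The canonical eye coordinates below are illustrative assumptions:

```python
import numpy as np

def eye_alignment_matrix(left_eye, right_eye,
                         out_left=(40.0, 52.0), out_right=(88.0, 52.0)):
    """Return a 2x3 affine matrix mapping the detected eye centers onto
    assumed canonical positions in a 128x128 crop. Applying it to every
    pixel coordinate (e.g. via cv2.warpAffine) performs the alignment."""
    src = np.asarray(right_eye, float) - np.asarray(left_eye, float)
    dst = np.asarray(out_right, float) - np.asarray(out_left, float)
    # Similarity transform: scale and rotation carrying src onto dst.
    scale = np.hypot(*dst) / np.hypot(*src)
    angle = np.arctan2(dst[1], dst[0]) - np.arctan2(src[1], src[0])
    c, s = scale * np.cos(angle), scale * np.sin(angle)
    A = np.array([[c, -s], [s, c]])
    t = np.asarray(out_left, float) - A @ np.asarray(left_eye, float)
    return np.hstack([A, t[:, None]])
```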
Step S2, in which the student facial expression recognition model learning module retrains the pre-training model with the student facial expression image data set to obtain the student expression recognition model, comprises:
S21, training a student expression recognition model:
the algorithm model selected in this embodiment is a deep residual network (ResNet-34). The parameters of a deep residual network trained on a large-scale face image data set are used as the pre-training model for facial expression recognition, with only the convolutional-layer parameters of the pre-training model serving as initialization parameters for the model of the present invention. The number of neurons in the fully connected layer is set to 512, and this layer is trained from scratch on the student facial expression image data set constructed in this embodiment; finally, a Softmax regression model is adopted as the classifier for student expression recognition. The student expression recognition model is obtained through training in this manner.
S22, judging whether the specified iteration times are met:
in this embodiment, the specified number of iterations is 40 epochs, where 1 epoch means that all training data (i.e., the student facial expression image data set samples) have been fed into the model and one round of weight adjustment has been completed. While the model has not reached the set number of iterations, images continue to be input to adjust the model weights; once the set number of iterations is reached, training stops.
S23, selecting the optimal model weight parameters:
after training stops, the optimal weight parameters are selected according to the recognition accuracy computed for each epoch, and these parameters are saved to a file.
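Steps S22 and S23 together amount to a fixed-length training loop that tracks the best epoch. A minimal dependency-free sketch; the `train_one_epoch` callback is an assumption standing in for the actual per-epoch training and evaluation code:

```python
def train_with_best_checkpoint(train_one_epoch, num_epochs: int = 40):
    """Run the specified number of epochs and keep the weights of the
    epoch with the highest recognition accuracy. `train_one_epoch` is a
    stand-in callback returning (accuracy, weights) for each epoch."""
    best_acc, best_weights = -1.0, None
    for epoch in range(num_epochs):
        acc, weights = train_one_epoch(epoch)
        if acc > best_acc:                   # screen out the optimal parameters
            best_acc, best_weights = acc, weights
    return best_acc, best_weights            # best_weights would be saved to a file
```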
The trained student expression recognition model can then be used to recognize student facial expressions in online education: a facial expression screenshot of the student is acquired, and the student's learning state is predicted and judged.
Step S3, in which the prediction module acquires a facial expression image of the student and predicts student state data based on the student expression recognition model, comprises:
S31, acquiring a video screenshot related to the expression of the student:
in online education, students mostly attend class through a client, so the video screenshots can come from the student client. Each screenshot can be obtained through a download link of the cloud storage service, and the image recognition task is performed after the picture has been downloaded locally.
S32, processing the acquired video screenshot to obtain an image for face alignment calibration:
the method comprises the following steps:
S321, image preprocessing:
the acquired images are first size-normalized, with all of them uniformly resized to 320 × 240 pixels;
S322, face detection:
the face position in the preprocessed image is detected by combining the histogram of oriented gradients (HOG) feature of the image with a support vector machine (SVM) classifier model;
S323, annotating face key points:
after the face region is cropped and size-normalized, face key points are annotated using a cascade of regression trees to obtain face key point annotation information;
S324, face alignment calibration:
the face pose is aligned and calibrated according to the five face key points annotated in the previous step: an affine transformation applies translation, scaling, rotation, and similar operations to the face image region, and the aligned face region is then cropped and normalized to 128 × 128 pixels.
S33, predicting the facial expression:
this step uses the same deep residual network model as the learning stage of step S2, but the model's weight parameters no longer need to be adjusted through training; instead, the optimal weight parameters are loaded into the model to predict the facial expression category of the input image (the image processed in S324).
S34, calculating a prediction score:
the Softmax regression model in the above steps converts the neural network output into a probability distribution over the expression categories; the category with the highest predicted probability determines the expression category of the head portrait, and that probability value is taken as the prediction score. The prediction score reflects the facial emotion analysis result.
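The conversion from raw network outputs to a predicted category and score described above is a standard softmax step; a minimal numpy sketch:

```python
import numpy as np

def predict_expression(logits: np.ndarray):
    """Convert raw network outputs (logits) into a probability distribution
    over the expression classes, returning the most probable class index
    and its probability as the prediction score."""
    z = logits - logits.max()                # shift for numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    idx = int(probs.argmax())
    return idx, float(probs[idx])
```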
S35, writing the processing result into a database:
finally, the recognized analysis result is written to a database through a back-end interface, providing a basis for client feedback or data analysis samples.
The device of this embodiment is used to implement the above student facial expression recognition method. The programming languages involved include object-oriented languages such as Java, Smalltalk, and C++, as well as conventional procedural languages such as the "C" language or similar; the principles of the procedure's execution steps can be found in the description of embodiment 1 above and are not repeated here.
Example 3
The invention also provides a system comprising the above device. The system further comprises a student facial expression recognition task message queue management module, a student facial expression recognition task scheduling module, and a business integration module for student facial expression data, wherein:
the student facial expression recognition task message queue management module is used for acquiring student facial expression recognition request messages and student video screenshots of the student client;
the student facial expression recognition task scheduling module is used for receiving the student facial expression recognition request and the student end video screenshot, performing task scheduling, and calling the prediction module to recognize the category of the student facial expression;
the business integration module for student facial expression data is used for creating a facial expression recognition result data query request, and the queried result is fed back in real time to the client of the teacher currently in class.
Specifically, the student facial expression recognition task message queue management module comprises the following execution steps:
first, the student client (i.e., the student-side APP of the online education system) periodically captures a head portrait picture of the student, uploads the picture to an Alibaba Cloud Object Storage Service (OSS) server through an HTTP request, and generates a download address link;
further, the generated picture download link and other classroom-related information are written into a relational database;
finally, the facial expression recognition request message for the student in the classroom is published using the producer mode of the Alibaba Cloud RocketMQ message queue system.
The student facial expression recognition task scheduling module comprises the following execution steps:
first, the facial expression recognition request message for the student in the classroom is consumed in the consumer mode of the RocketMQ message queue system;
further, the received task request messages are distributed appropriately among the task execution units. Because this embodiment of the invention needs to handle many concurrent tasks, a distributed asynchronous task scheduling mechanism is adopted, and each deployed task execution unit also supports executing multiple concurrent tasks;
further, as shown in fig. 2, a flow of each task execution unit executing one task includes the following steps:
first, each task in the task set is initialized in the order in which it entered the message queue, the current task queue is determined, and the first task in the queue is selected as the task to be processed;
further, the prediction module of the student expression recognition model shown in fig. 1 is called to recognize the category of the student's facial expression, the current task is executed at the processor's maximum rate, and the final execution result of the task is written into the relational database;
further, it is judged whether the current task has finished executing or its execution time has expired; if neither, the unit continues to wait for the current task to finish, and otherwise the current task is released. It is then judged whether the end of the task queue has been reached: if not, the task is removed from the queue and the first unexecuted task in the queue is selected for execution; otherwise, the task scheduling work of this task execution unit ends.
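The per-unit scheduling flow above reduces to a queue-draining loop with a completion/timeout check per task. A simplified single-threaded sketch; the `execute` callback and the timeout-polling count are assumptions standing in for the prediction call, the database write, and the real expiry clock:

```python
from collections import deque

def run_task_unit(tasks, execute, timeout_checks: int = 3):
    """Drain a FIFO task queue in arrival order. Each task is polled via
    `execute` (returns True when the task has completed); a task that
    neither finishes nor times out is polled again, and otherwise it is
    released and the next unexecuted task is taken."""
    queue = deque(tasks)                     # order of entry into the message queue
    released = []
    while queue:
        task = queue.popleft()               # first task in the queue
        waited = 0
        while not execute(task):
            waited += 1
            if waited >= timeout_checks:     # execution time is overdue
                break
        released.append(task)                # task released (done or expired)
    return released
```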
The implementation steps of the business integration module of the student face expression data are as follows:
firstly, creating a data query request of a facial expression recognition result through a background service interface;
further, the queried result can be fed back in real time to the client of the teacher currently in class (i.e., the teacher-side APP of the online education system), where the student's emotion and learning state at that moment are judged so that effective measures can be taken in time to encourage the student and improve classroom enthusiasm;
finally, the query result can be updated to a classroom quality supervision and management platform for producing classroom summary reports or generating statistical reports for use by class quality supervisors and the like.
Based on a big data platform, the system achieves real-time and robust student facial expression recognition. A distributed asynchronous computing system processes classroom screenshots in real time and returns the recognition result to the teacher client for judging the students' current expressions and states, which helps the teacher grasp the students' online learning state in real time, adjust the class accordingly, and improve both teaching quality and student enthusiasm, making the system suitable for popularization.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solution of the present invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described therein may still be modified, or some or all of the technical features equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the present invention and should be construed as falling within the following claims and description.

Claims (10)

1. A student facial expression recognition method for online education is characterized by comprising
S1, adopting an artificially labeled student face expression image to perform face detection and face alignment calibration processing, and constructing a student face expression image data set;
S2, retraining a pre-training model with the student facial expression image data set to obtain a student expression recognition model;
S3, acquiring a facial expression image of the student and predicting student state data based on the student expression recognition model.
2. The method for recognizing facial expressions of students facing online education according to claim 1, wherein S1, adopting the artificially labeled student facial expression images to perform face detection and face alignment calibration to construct the student facial expression image data set, comprises
S11, constructing a student expression image data set:
adopting artificially labeled student head portrait screenshots as data set samples, wherein the labels to be annotated comprise a plurality of expression categories;
S12, image preprocessing:
preprocessing the data set sample images;
S13, face detection:
detecting the face position in the preprocessed image;
S14, labeling key points of the human face:
carrying out face key point labeling on the detected face position to obtain face key point labeling information;
S15, face alignment calibration:
carrying out alignment calibration on the face image according to the face key point labeling information.
3. The method for recognizing facial expressions of students facing online education according to claim 2, wherein S2, retraining the pre-training model with the student facial expression image data set to obtain the student expression recognition model, comprises
S21, training a student expression recognition model:
adopting the parameters of a deep residual network trained on a large-scale face image data set as the pre-training model for facial expression recognition, wherein only the convolutional-layer parameters of the pre-training model are used as initialization parameters for training the student expression recognition model, and the number of neurons in the fully connected layer of the pre-training model is set to 512; training on the student facial expression image data set from scratch, wherein the final output layer of the pre-training model is a Softmax layer; and finally adopting a Softmax regression model as the classifier model for student expression recognition, thereby obtaining the student expression recognition model through training.
4. The method for recognizing facial expressions of students facing online education according to claim 3, wherein S2, retraining the pre-training model with the student facial expression image data set to obtain the student expression recognition model, further comprises
S22, judging whether the specified iteration times are met:
the specified number of iterations is 40 epochs; while the model has not reached the set number of iterations, images continue to be input to adjust the model weight parameters, and training stops once the set number of iterations is reached;
S23, selecting the optimal model weight parameters:
after the model stops training, the optimal weight parameters are selected according to the recognition accuracy computed for each epoch, and the weight parameters are saved to a file.
5. The method for recognizing facial expressions of students facing online education according to claim 4, wherein S3, acquiring a facial expression image of the student and predicting student state data based on the student expression recognition model, comprises:
S31, acquiring a video screenshot related to the student's expression;
S32, processing the acquired video screenshot to obtain a face-alignment-calibrated image;
S33, predicting the facial expression of the face-alignment-calibrated image;
S34, calculating a prediction score;
S35, writing the calculation result into a database.
6. An apparatus for implementing the method for recognizing facial expressions of students facing online education according to any one of claims 1 to 5, comprising a learning module and a prediction module of the student facial expression recognition model, wherein
the execution steps of the learning module comprise:
S1, adopting artificially labeled student facial expression images to perform face detection and face alignment calibration, and constructing a student facial expression image data set;
S2, retraining a pre-training model with the student facial expression image data set to obtain a student expression recognition model;
the prediction module executes the steps comprising:
S3, acquiring a facial expression image of the student and predicting student state data based on the student expression recognition model.
7. A system comprising the apparatus of claim 6, wherein the system further comprises a student facial expression recognition task message queue management module, a student facial expression recognition task scheduling module, and a business integration module for student facial expression data, wherein:
the student facial expression recognition task message queue management module is used for acquiring student facial expression recognition request messages and student video screenshots of the student client;
the student facial expression recognition task scheduling module is used for receiving the student facial expression recognition request and the student end video screenshot, performing task scheduling, and calling the prediction module to recognize the category of the student facial expression;
the business integration module for the student facial expression data is used for creating a facial expression recognition result data query request, and the queried result is fed back in real time to the client of the teacher currently in class.
8. The system of claim 7, wherein:
the student facial expression recognition task message queue management module comprises the following execution steps:
first, the student client periodically captures head portrait pictures of students, uploads the pictures to an Alibaba Cloud Object Storage Service server through an HTTP request, and generates a download address link;
writing the generated picture download link and the classroom-related information into a relational database;
publishing the student facial expression recognition request message using the producer mode of the Alibaba Cloud RocketMQ message queue system.
9. The system of claim 8, wherein:
the student facial expression recognition task scheduling module comprises the following execution steps:
consuming the facial expression recognition request message for the student in the classroom in the consumer mode of the RocketMQ message queue system;
distributing the received task request message to each task execution unit; the process of each task execution unit when executing one task comprises the following steps:
initializing each task in the task set according to the sequence of the task entering the message queue, determining a current task queue, and selecting a first task of the queue as a task needing to be processed currently;
calling the prediction module of the student expression recognition model to recognize the category of the student's facial expression, executing the current task at the processor's maximum rate, and writing the final execution result of the task into a relational database;
judging whether the current task has finished executing or its execution time has expired; if neither, continuing to wait for the current task to finish, and otherwise releasing the current task; then judging whether the end of the task queue has been reached: if not, removing the task from the task queue and selecting the first unexecuted task in the queue for execution; otherwise, ending the task scheduling work of this task execution unit.
10. The system of claim 9, wherein:
the implementation steps of the business integration module of the student face expression data are as follows:
creating a data query request of the facial expression recognition result through a background service interface;
feeding the queried result back in real time to the client of the teacher in class;
updating the query result to a classroom quality supervision and management platform for producing classroom summary reports or generating statistical reports.
CN201911377459.2A 2019-12-27 2019-12-27 Student facial expression recognition method and system for online education Pending CN111178242A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911377459.2A CN111178242A (en) 2019-12-27 2019-12-27 Student facial expression recognition method and system for online education

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911377459.2A CN111178242A (en) 2019-12-27 2019-12-27 Student facial expression recognition method and system for online education

Publications (1)

Publication Number Publication Date
CN111178242A true CN111178242A (en) 2020-05-19

Family

ID=70650414

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911377459.2A Pending CN111178242A (en) 2019-12-27 2019-12-27 Student facial expression recognition method and system for online education

Country Status (1)

Country Link
CN (1) CN111178242A (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106951867A (en) * 2017-03-22 2017-07-14 成都擎天树科技有限公司 Face identification method, device, system and equipment based on convolutional neural networks
CN110008841A (en) * 2019-03-08 2019-07-12 中国华戎科技集团有限公司 A kind of Expression Recognition model building method and system
CN110175534A (en) * 2019-05-08 2019-08-27 长春师范大学 Teaching assisting system based on multitask concatenated convolutional neural network
CN110363245A (en) * 2019-07-17 2019-10-22 上海掌学教育科技有限公司 Excellent picture screening technique, the apparatus and system of Online class


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Liu Jian: "Research on Image Classification Applications Based on Transfer Convolutional Neural Networks", pages 28 - 35 *
Cao Wenjing: "Research on Face Recognition Algorithms for Multi-Pose Facial Expressions", pages 138 - 1396 *
Du Yilin et al.: "Research on Recognition Technology Based on Real Measured Three-Dimensional Faces", Xidian University Press, pages 104 - 106 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111832405A (en) * 2020-06-05 2020-10-27 天津大学 Face recognition method based on HOG and depth residual error network
CN112597888A (en) * 2020-12-22 2021-04-02 西北工业大学 On-line education scene student attention recognition method aiming at CPU operation optimization
CN112597888B (en) * 2020-12-22 2024-03-08 西北工业大学 Online education scene student attention recognition method aiming at CPU operation optimization
CN112883867A (en) * 2021-02-09 2021-06-01 广州汇才创智科技有限公司 Student online learning evaluation method and system based on image emotion analysis
CN112733806A (en) * 2021-02-18 2021-04-30 北京工商大学 Expression recognition-based classroom student real-time classification and selection method
CN112733806B (en) * 2021-02-18 2023-07-28 北京工商大学 Real-time classifying and selecting method for students in class based on expression recognition
CN113158872A (en) * 2021-04-16 2021-07-23 中国海洋大学 Online learner emotion recognition method
CN113657302A (en) * 2021-08-20 2021-11-16 重庆电子工程职业学院 State analysis system based on expression recognition
CN113963406A (en) * 2021-10-21 2022-01-21 吉林大学 Classroom teaching quality monitoring method based on facial expression analysis
WO2024049439A1 (en) * 2022-09-02 2024-03-07 Hewlett-Packard Development Company, L.P. Training instances of machine learning model for facial expression prediction and generating new avatars used in training

Similar Documents

Publication Publication Date Title
CN111178242A (en) Student facial expression recognition method and system for online education
US11657602B2 (en) Font identification from imagery
US9875445B2 (en) Dynamic hybrid models for multimodal analysis
US20190034814A1 (en) Deep multi-task representation learning
CN113822192A (en) Method, device and medium for identifying emotion of escort personnel based on Transformer multi-modal feature fusion
CN113448843B (en) Image recognition software test data enhancement method and device based on defect analysis
CN111639186A (en) Multi-class multi-label text classification model and device dynamically embedded with projection gate
Krichen Generative adversarial networks
CN110458794A (en) Fittings quality detection method and device for track train
CN114549557A (en) Portrait segmentation network training method, device, equipment and medium
CN110795410A (en) Multi-field text classification method
CN110363245B (en) Online classroom highlight screening method, device and system
Athanesious et al. Deep learning based automated attendance system
CN115116117A (en) Learning input data acquisition method based on multi-mode fusion network
CN114612961A (en) Multi-source cross-domain expression recognition method and device and storage medium
Hernández-Ferrándiz et al. SCASA: From synthetic to real computer-aided sperm analysis
TW202139061A (en) Action recognition method and device,computer readable storage medium
Zhang et al. Real-time sperm detection using lightweight yolov5
Ren et al. Video-based emotion recognition using multi-dichotomy RNN-DNN
Mercurio LIS2SPEECH LIS translation in written text and spoken language
CN117033733B (en) Intelligent automatic classification and label generation system and method for library resources
Solis et al. Recognition of handwritten Japanese characters using ensemble of convolutional neural networks
CN116503674B (en) Small sample image classification method, device and medium based on semantic guidance
US20230326014A1 (en) Automated aneuploidy screening using arbitrated ensembles
Marquis et al. Automatic Honey Bee Queen Presence Detection on Beehive Frames Using Machine Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination