CN112836679B - Fast expression recognition algorithm and system based on dual-model probability optimization - Google Patents
- Publication number
- CN112836679B CN202110233127.8A
- Authority
- CN
- China
- Prior art keywords
- model
- layer
- dual
- pain
- optimization
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a rapid expression recognition algorithm and system based on dual-model probability optimization, mainly comprising a face recognition and cropping module, an image preprocessing module, a dual-model prediction module and a combined probability optimization module. The face is cropped from the input image and binarized; the standard image and the binarized image are sent in parallel to a dual-model classifier trained on standard and binarized images, respectively; two optimization algorithms are provided to perform combined probability optimization on the dual-model outputs; and high-accuracy judgment and recognition are achieved by probability optimization of the recognition results of two low-accuracy lightweight neural network models. The invention achieves a recognition rate of no less than 99% for pain lasting the threshold time or longer, greatly reduces computation and storage cost, and can also effectively recognize other types of expressions.
Description
Technical Field
The invention belongs to the field of image processing and emotion recognition, and particularly relates to a rapid expression recognition algorithm and system based on dual-model probability optimization.
Background
With the development of society, people pay increasing attention to acute diseases. Because acute diseases have sudden onset, severe symptoms and rapid progression, effective prevention is difficult; an attack is often accompanied by severe pain that leaves the patient unable to call for help or receive timely treatment, which greatly threatens the patient's life. At present, acute episodes are usually detected by human caregivers, by video monitoring, or by specific sensing devices attached to the patient's body, which wastes human resources and restricts the patient's movement. Although existing neural networks already achieve high accuracy, their storage and computation costs are high. All of these factors limit the popularization of intelligent terminals for this kind of special monitoring.
Expressions are generally divided into six basic categories: happiness, sadness, fear, anger, disgust and surprise. Facial expressions reflect an individual's psychological state: facial expression conveys about 55% of emotional information, while vocal tone and language convey only about 38% and 7%, respectively, so pain can be expressed through facial expression even though it is not conventionally classified as an emotion. Conventional expression recognition algorithms generally fall into two categories: static image classification, which is faster but cannot capture the spatio-temporal features of facial expressions well, and dynamic video classification, which captures spatio-temporal features effectively but is computationally more complex. Most such models focus on changes to the network structure and design; although their accuracy is very high, their computation and storage costs are large, making them unsuitable for deployment on intelligent terminals with low computing power.
CN111466878A discloses a real-time monitoring method and device for pain symptoms of bedridden patients based on expression recognition. The method comprises: (1) establishing a pain expression training data set; (2) establishing and training a neural network model for analyzing pain expression to obtain a pain grading model; (3) acquiring three real-time facial images at the same moment, preprocessing them, and inputting them into the neural network model to obtain the probabilities corresponding to the A pain levels; the pain level with the maximum probability is selected as the pain level of the detected image at the current moment, and an alarm is raised when the pain level exceeds a threshold, thereby realizing real-time monitoring. That invention can evaluate pain accurately, automatically and in real time, thus effectively monitoring bedridden patients.
CN106682616B discloses a neonatal pain expression recognition method based on dual-channel feature deep learning. The neonatal face image is first converted to grayscale and its local binary pattern (LBP) feature map is extracted; a two-channel convolutional neural network then performs deep learning on the grayscale image and its LBP feature map, which are input in parallel; finally, a softmax-based classifier classifies the fused two-channel features into four expressions: calm, crying, mild pain and severe pain. By combining the feature information of the grayscale image and the LBP feature map, the method can effectively recognize calm, crying, mild pain and severe pain, is robust to illumination, noise and occlusion in neonatal face images, and provides a new method and approach for developing a neonatal pain expression recognition system; however, its computation cost is high, which hinders deployment on intelligent terminals.
CN202011036047.5 discloses an intelligent recognition method for the pain expression of elderly persons on a nursing bed. It combines a deep belief network to extract pain expression features from the face image, which describe the expression more effectively, and, to address the small sample size in pain expression recognition, uses a generative model with a semi-supervised learning method that combines labeled and unlabeled samples. That invention focuses on adjusting the structure and parameters of the neural network so that pain expressions can be recognized effectively, but once the neural network is trained its accuracy can no longer be adjusted, owing to the uncertainty inherent in the neural network training process.
To solve the problem of recognizing acute disease, the invention provides an acute disease recognition algorithm that recognizes pain lasting a threshold time or longer from the patient's expression and effectively ignores incidental transient pain.
Disclosure of Invention
To solve the above problems, the invention provides an algorithm that continuously judges the peak of an expression action by a combined probability optimization algorithm based on two simple models trained on static images. The method can quickly and accurately recognize persistent pain lasting a set threshold time or longer, while ignoring transient incidental pain.
To solve the above technical problems, the invention provides a rapid expression recognition algorithm based on dual-model probability optimization, which recognizes expression actions mainly through combined probability optimization of the dual-model outputs. The technical scheme is specifically as follows:
a fast expression recognition algorithm based on dual-model probability optimization specifically comprises the following steps:
step 1, capturing a frame from the camera and cropping out a face image as the standard image;
the video frame is passed to a Haar cascade face classifier through a library function provided by opencv; the classifier locates the face in the image, and the face is cropped as the standard image in preparation for image preprocessing;
step 2, performing binarization preprocessing on the standard image, and applying median filtering to reduce interference from invalid features, obtaining a binary image;
the standard image is converted to grayscale and binarized, and a median filtering algorithm is used to denoise the image in order to reduce the influence of irrelevant features such as beard and spots;
step 3, sending the standard image and the binary image respectively into the Mini_Xception model and the CNN7 model for parallel judgment, and recognizing the facial expression category in the image;
the CNN7 model and the Mini_Xception model each judge the image independently; the facial expression categories are divided into normal and pain, with pain recorded as 1 and normal recorded as 0 during judgment;
step 4, if both model outputs indicate pain, namely the judgment results of the CNN7 model and the Mini_Xception model are equal and both 1, activating the optimization algorithm, going to step 5, performing combined probability optimization and outputting the final result; otherwise returning to step 1;
step 5, setting a counter sum to 1, taking statistics over the L images, and accumulating according to the optimization algorithm;
after the counter sum is set to 1, the judgment results of the subsequent L images are recorded; when optimization algorithm one is used, the counter sum is increased by 1 each time both models judge pain; when optimization algorithm two is used, sum is increased by 1 when both models judge pain, by W when only model one judges pain, and by 1-W when only model two judges pain;
step 6, when sum accumulated over the L images reaches K or exceeds K, going to step 7; otherwise returning to step 1;
when optimization algorithm one is used, step 7 is entered when the counter sum equals the threshold K; when optimization algorithm two is used, step 7 is entered when the counter sum is greater than or equal to K;
step 7, outputting an alarm to complete the judgment of the expression action; returning to step 1;
when the output condition is met, the program prints a Warning cyclically until manual intervention. A minimal sketch of the per-frame processing in steps 1 to 3 is given below.
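The following is a minimal Python/OpenCV sketch of the per-frame processing in steps 1 to 3. The Haar cascade file, the 224x224 crop size, the Otsu thresholding, the 3x3 median-filter kernel, the use of a grayscale standard image and the placeholder classifier handles `mini_xception` and `cnn7` are assumptions made for illustration and are not fixed by the invention.

```python
import cv2
import numpy as np

# Haar cascade face detector shipped with opencv-python (file choice assumed)
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def preprocess(frame, size=(224, 224)):
    """Steps 1-2: crop the face as the standard image and derive the binary image."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None, None
    x, y, w, h = faces[0]
    standard = cv2.resize(gray[y:y + h, x:x + w], size)   # standard (grayscale) face image
    _, binary = cv2.threshold(standard, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    binary = cv2.medianBlur(binary, 3)                     # suppress beard/spot-like noise
    return standard, binary

def dual_predict(standard, binary, mini_xception, cnn7):
    """Step 3: parallel judgment by the two classifiers; 1 = pain, 0 = normal."""
    r_mini = int(np.argmax(mini_xception.predict(standard[None, ..., None] / 255.0)))
    r_cnn7 = int(np.argmax(cnn7.predict(binary[None, ..., None] / 255.0)))
    return r_cnn7, r_mini
```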
Furthermore, the CNN7 model mainly consists of two parts: the first part consists of 7 convolution modules using a two-dimensional convolution layer (Conv2D), a Dropout layer and a maximum pooling layer (MaxPooling); all 7 convolution modules use ReLU as the activation function, which reduces the computation cost of the neural network and improves the efficiency of gradient descent and back-propagation; the second part consists of fully connected layers, the first of which has 128 neurons with ReLU activation, while the second consists of 2 neurons with softmax activation and performs the classification.
The Mini_Xception model mainly consists of 3 parts: the first part is two Conv2D layers, both using a batch normalization (BN) layer to reduce overfitting and ReLU as the activation function; the second part consists of five dual-channel modules, in which the left channel consists of one Conv2D layer and one BN layer, and the right channel consists of two sub-modules, the first composed of a depthwise separable convolution layer, a BN layer and a ReLU layer, and the second composed of a depthwise separable convolution layer and a BN layer, followed by a MaxPooling layer to reduce the amount of computation; finally, an Add layer merges the two channels and passes the result to the next module; the third part consists of a Conv2D layer, a Dropout layer and a global average pooling layer, and the output features are finally classified with a softmax activation function.
Furthermore, the probability optimization combines the dual-model outputs using the mathematical model of an optimization algorithm and makes a judgment on the overall expression action. To this end, the invention provides two probability optimization algorithms that solve this problem. Preferably, optimization algorithm two is selected and improved, since it coordinates the relationship between L and K well, so that the false-alarm interval and the false-alarm rate are reduced as much as possible while the accuracy is guaranteed.
Specifically, the first optimization algorithm is as follows:
when both model predictions indicate pain, the counter sum is set to 1 and the following L images are taken; each time both models again predict pain, the counter sum is increased by 1, and when sum equals K an interrupt is generated and an alarm is triggered; the formula is as follows:
wherein P is the overall success probability of the algorithm, P_00 represents the probability that both models simultaneously judge pain, K represents the set threshold, L represents the number of subsequent images taken for statistics after both models first judge pain simultaneously, and P_start represents the probability that, when the pain sequence begins, one of the first n photos is judged as pain by both models, given by the following formula:
specifically, the second optimization algorithm is as follows:
the weight of model one is set to W and the weight of model two to 1-W; when both model predictions indicate pain, the counter sum is set to 1 and the following L images are taken; each time both models predict pain, the counter sum is increased by 1; when model one judges pain and model two judges normal, the counter sum is increased by W; when model two judges pain and model one judges normal, the counter sum is increased by 1-W; when sum is greater than or equal to K, an interrupt is generated and an alarm is triggered; wherein K is the set threshold and L is the number of subsequent images taken for statistics after both models first judge pain simultaneously;
if there exist a and b with 0 ≤ a ≤ L and 0 ≤ b ≤ L such that aW + b(1-W) < K, the success probability of the algorithm is given by the following formula:
wherein P_1 represents the probability that model one judges correctly, P_2 represents the probability that model two judges correctly, a and b respectively represent the numbers of correctly judged images, and the formula ranges over the set of all possible pairs (a, b); clearly there exist real numbers a and b in the interval [0, L] that make the inequality aW + b(1-W) < K hold.
The invention also aims to provide a rapid expression recognition system based on dual-model probability optimization, which mainly comprises four modules,
first, a face recognition and cropping module, which crops the face from the image output by the camera as the standard image;
second, an image preprocessing module, which binarizes the standard image and applies median filtering to reduce interference from invalid features, obtaining a binary image;
third, a dual-model prediction module, which sends the standard image and the binary image respectively to the Mini_Xception model and the CNN7 model for parallel judgment and recognizes the facial expression category in the image; the facial expression categories are divided into normal and pain, with pain recorded as 1 and normal recorded as 0 during judgment, and the execution of the next step is determined by comparing the results;
fourth, a combined probability optimization module, which performs probability optimization on the outputs of the Mini_Xception model and the CNN7 model and outputs the final result; the specific process is as follows: judge whether both model outputs indicate pain; if so, set the counter sum to 1, take statistics over the L images, and accumulate sum according to the optimization algorithm; when sum accumulated over the L images reaches the threshold K or exceeds K, output an alarm and complete the judgment of the expression action; otherwise, return to module one.
In order to realize automatic recognition during acute attacks and reduce computation and storage cost, the invention provides an acute disease recognition algorithm and system that judge according to the peak of the patient's pain expression action. The face is cropped from the input image and binarized; the standard image and the binary image are sent in parallel to a dual-model classifier trained on standard and binarized images, respectively; two optimization algorithms are provided to perform combined probability optimization on the dual-model outputs; and high-accuracy judgment and recognition are achieved by probability optimization of the recognition results of two low-accuracy lightweight neural network models. The invention achieves a recognition rate of no less than 99% for pain lasting the threshold time or longer, greatly reduces computation and storage cost, and can also effectively recognize other types of expressions.
Compared with the prior art, the invention has the following beneficial effects and advances:
On the basis of the neural network judgments, the invention optimally combines the neural network outputs through an optimization algorithm, achieving high-accuracy recognition of various continuous expression actions. On the one hand, the invention uses neural network models with a simple structure and an algorithm with a concise idea, which effectively reduces computation and storage cost and gives a high computation speed; on the other hand, a mathematical model of the probability optimization algorithm is provided, which offers better controllability than the uncertainty of a pure neural network. The invention therefore has high application value in fields such as acute disease alarm and social emotion recognition.
Drawings
FIG. 1 is a flow chart of the structure of the recognition method of the present invention;
FIG. 2 is a schematic structural diagram of CNN 7;
FIG. 3 is a schematic structural diagram of Mini_Xception;
FIG. 4 is a schematic diagram of a probabilistic optimization algorithm idea.
Detailed Description
The following detailed description of embodiments of the invention is provided in conjunction with the accompanying drawings:
the embodiment provides a rapid expression recognition system based on dual-model probability optimization, which mainly comprises four modules, as shown in fig. 1. Firstly, a face recognition cutting module cuts a face from an image output by a camera to be used as a standard image; secondly, the image preprocessing module is used for carrying out binaryzation on the standard image and carrying out median filtering for reducing interference of invalid features to obtain a binary image; thirdly, a dual-model prediction module is used for recognizing the facial expression category in the image; and fourthly, combining a probability optimization module to perform probability optimization on the dual-model output result and output a final result.
Before using a probability optimization algorithm to perform overall judgment on a model output result, a classification model is required to classify facial expressions, in the embodiment, a standard image and a binary image are respectively sent to a Mini _ Xcenter model and a CNN7 model to perform parallel judgment, and the facial expression categories in the images are identified; the facial expression categories are divided into normal and painful categories, the pain is recorded as 1 in the judging process, the pain is recorded as 0 in the normal condition, and the execution of the next step is determined by comparing results;
(1)CNN7
CNN7 mainly consists of two parts. The first part consists of 7 convolution modules using a two-dimensional convolution layer (Conv2D), a Dropout layer and a maximum pooling layer (MaxPooling); all 7 convolution modules use ReLU as the activation function, which reduces the computation cost of the neural network and improves the efficiency of gradient descent and back-propagation. The second part consists of fully connected layers: the first fully connected layer has 128 neurons with ReLU activation, and the second consists of 2 neurons with softmax activation, which performs the classification. The network structure is shown in fig. 2, and a minimal Keras sketch is given below.
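The following Keras sketch illustrates this two-part structure. The filter counts, kernel sizes, dropout rate and the 224x224 single-channel input are assumptions; the description above fixes only the layer types, the seven-module arrangement, the 128-neuron dense layer and the 2-way softmax output.

```python
from tensorflow.keras import layers, models

def build_cnn7(input_shape=(224, 224, 1),
               filters=(16, 32, 64, 64, 128, 128, 256)):
    model = models.Sequential()
    # Part 1: seven convolution modules, each Conv2D -> Dropout -> MaxPooling with ReLU
    for i, f in enumerate(filters):
        if i == 0:
            model.add(layers.Conv2D(f, (3, 3), padding="same",
                                    activation="relu", input_shape=input_shape))
        else:
            model.add(layers.Conv2D(f, (3, 3), padding="same", activation="relu"))
        model.add(layers.Dropout(0.25))
        model.add(layers.MaxPooling2D((2, 2)))
    # Part 2: fully connected head, 128 ReLU neurons then a 2-way softmax (normal / pain)
    model.add(layers.Flatten())
    model.add(layers.Dense(128, activation="relu"))
    model.add(layers.Dense(2, activation="softmax"))
    return model
```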
(2)Mini_Xception
Mini_Xception mainly consists of 3 parts. The first part is two Conv2D layers, both using a batch normalization (BN) layer to reduce overfitting and ReLU as the activation function. The second part consists of five dual-channel modules: the left channel consists of one Conv2D layer and one BN layer; the right channel consists of two sub-modules, the first composed of a depthwise separable convolution layer, a BN layer and a ReLU layer, and the second composed of a depthwise separable convolution layer and a BN layer, followed by a MaxPooling layer to reduce the amount of computation; finally, an Add layer merges the two channels and passes the result to the next module. The third part consists of a Conv2D layer, a Dropout layer and a global average pooling layer, and the output features are finally classified with a softmax activation function. The structure is shown in fig. 3, and a minimal Keras sketch is given below.
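The following Keras sketch illustrates this three-part structure. The filter counts, kernel sizes, dropout rate, the stride-2 convolution in the left channel (chosen so that the two channels of each module have matching shapes before the Add layer) and the 224x224 single-channel input are assumptions; the description above fixes only the layer types and the three-part, five-module arrangement.

```python
from tensorflow.keras import layers, models

def build_mini_xception(input_shape=(224, 224, 1),
                        module_filters=(16, 32, 64, 128, 128)):
    inputs = layers.Input(shape=input_shape)
    # Part 1: two Conv2D layers, each followed by BatchNormalization and ReLU
    x = inputs
    for f in (8, 8):
        x = layers.Conv2D(f, (3, 3), padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
    # Part 2: five dual-channel modules merged by an Add layer
    for f in module_filters:
        # left channel: Conv2D + BN (stride 2 assumed so both channels downsample equally)
        left = layers.Conv2D(f, (1, 1), strides=(2, 2), padding="same")(x)
        left = layers.BatchNormalization()(left)
        # right channel: SeparableConv2D + BN + ReLU, SeparableConv2D + BN, then MaxPooling
        right = layers.SeparableConv2D(f, (3, 3), padding="same")(x)
        right = layers.BatchNormalization()(right)
        right = layers.Activation("relu")(right)
        right = layers.SeparableConv2D(f, (3, 3), padding="same")(right)
        right = layers.BatchNormalization()(right)
        right = layers.MaxPooling2D((3, 3), strides=(2, 2), padding="same")(right)
        x = layers.Add()([left, right])
    # Part 3: Conv2D, Dropout, global average pooling and a softmax over the 2 classes
    x = layers.Conv2D(2, (3, 3), padding="same")(x)
    x = layers.Dropout(0.25)(x)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Activation("softmax")(x)
    return models.Model(inputs, outputs)
```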
Based on the judgment accuracy of the two models, a mathematical model of the probability optimization algorithm is established, the model outputs are combined, and the overall expression action is judged. The idea of the algorithm is illustrated in FIG. 4;
(1) optimization algorithm one
When both model predictions indicate pain, the counter sum is set to 1 and the following L images are taken; each time both models again predict pain, the counter sum is increased by 1, and when the counter sum equals K an interrupt is generated and an alarm is triggered. The formula is as follows:
wherein P is the overall success probability of the algorithm, P_00 represents the probability that both models simultaneously judge pain, K represents the set threshold, L represents the number of subsequent images taken for statistics after both models first judge pain simultaneously, and P_start represents the probability that, when the pain sequence begins, one of the first n photos is judged as pain by both models, given by the following formula:
(2) optimization algorithm two
The weight of model one is set to W and the weight of model two to 1-W. When both model predictions indicate pain, the counter sum is set to 1 and the following L images are taken; each time both models predict pain, the counter sum is increased by 1; when model one judges pain and model two judges normal, the counter sum is increased by W; when model two judges pain and model one judges normal, the counter sum is increased by 1-W; when sum is greater than or equal to K, an interrupt is generated and an alarm is triggered. Here K is the set threshold and L is the number of subsequent pictures taken for statistics after both models first judge pain simultaneously.
If there exist a and b with 0 ≤ a ≤ L and 0 ≤ b ≤ L such that aW + b(1-W) < K, the success probability of the algorithm is given by the following formula:
wherein P_1 represents the probability that model one judges correctly, P_2 represents the probability that model two judges correctly, a and b respectively represent the numbers of correctly judged images, and the formula ranges over the set of all possible pairs (a, b); clearly there exist real numbers a and b in the interval [0, L] that make the inequality aW + b(1-W) < K hold. A minimal sketch of the counting rules of both optimization algorithms is given below.
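The following is a minimal sketch, assuming the per-frame pain labels have already been produced by the two classifiers, of the counting rules of the two optimization algorithms described above. The lists r1 and r2 hold the decisions of model one (CNN7) and model two (Mini_Xception), 1 for pain and 0 for normal, over the L frames that follow the activating joint detection; L, K and W are the window length, the threshold and the model-one weight.

```python
def algorithm_one(r1, r2, K):
    """Alarm when the joint-pain count, starting from sum = 1, reaches K."""
    total = 1
    for a, b in zip(r1, r2):
        if a == 1 and b == 1:
            total += 1
        if total >= K:          # interrupt and trigger the alarm
            return True
    return False

def algorithm_two(r1, r2, K, W):
    """Weighted count: +1 if both judge pain, +W if only model one, +(1-W) if only model two."""
    total = 1
    for a, b in zip(r1, r2):
        if a == 1 and b == 1:
            total += 1
        elif a == 1:
            total += W
        elif b == 1:
            total += 1 - W
        if total >= K:          # interrupt and trigger the alarm
            return True
    return False
```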
The rapid expression recognition algorithm based on dual-model probability optimization provided by the invention is illustrated by two specific embodiments. In the experiments, 100 pain samples were input into the system to simulate 4 seconds of pain and it was tested whether the system raised an alarm; each algorithm was tested 2000 times to obtain the data.
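As a purely hypothetical illustration of how the overall success probability of either counting rule could be estimated, the following Monte Carlo sketch assumes the two models make independent per-frame decisions on pain frames with accuracies P1 and P2, and reuses the algorithm_one / algorithm_two sketches above; the values of P1, P2, L, K and W in the usage example are placeholders, not the settings behind the results reported in Tables 1 and 2.

```python
import random

def estimate_power(alarm_fn, P1, P2, L, trials=2000):
    """Fraction of simulated pain episodes in which the alarm fires."""
    hits = 0
    for _ in range(trials):
        # each model labels each of the L pain frames correctly (1 = pain)
        # with its own per-frame accuracy, independently of the other model
        r1 = [1 if random.random() < P1 else 0 for _ in range(L)]
        r2 = [1 if random.random() < P2 else 0 for _ in range(L)]
        if alarm_fn(r1, r2):
            hits += 1
    return hits / trials

# e.g. estimate_power(lambda a, b: algorithm_two(a, b, K=20, W=0.6),
#                     P1=0.85, P2=0.80, L=30)
```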
Example 1
A rapid expression recognition algorithm based on dual-model probability optimization:
step 1, capturing a frame from the camera and cropping out a face image as the standard image;
the video frame is passed to a Haar cascade face classifier through a library function provided by opencv; the classifier locates the face in the image, and the face is cropped into a (224, 224) standard image in preparation for image preprocessing;
step 2, performing binarization preprocessing on the standard image, and applying median filtering to reduce interference from invalid features, obtaining a binary image;
the standard image is converted to grayscale and binarized, and a median filtering algorithm is used to denoise the image in order to reduce the influence of irrelevant features such as beard and spots;
step 3, sending the standard image and the binary image respectively into the Mini_Xception model and the CNN7 model for parallel judgment, and recognizing the facial expression category in the image;
the CNN7 and Mini_Xception models each judge the image independently; the image categories are divided into normal and pain, with pain recorded as 1 and normal recorded as 0 during judgment, and the execution of the next step is determined by comparing the results;
step 4, if both model outputs indicate pain, namely the judgment results of the CNN7 model and the Mini_Xception model are equal and both 1, activating optimization algorithm one, going to step 5, performing combined probability optimization and outputting the final result; otherwise returning to step 1;
step 5, setting a counter sum to 1, taking statistics over the L images, and accumulating according to optimization algorithm one: each time both models judge pain, the counter sum is increased by 1;
step 6, when sum accumulated over the L images reaches K, going to step 7; otherwise returning to step 1;
step 7, generating an interrupt and outputting an alarm; the program prints a Warning cyclically until manual intervention, completing the judgment of the expression action; return to step 1.
The experimental results of Example 1 are shown in Table 1.
Table 1. Experimental results of optimization algorithm one
Acc represents the recognition accuracy of the entire expression action.
Example 2
A rapid expression recognition algorithm based on dual-model probability optimization:
step 1, capturing a frame from the camera and cropping out a face image as the standard image;
the video frame is passed to a Haar cascade face classifier through a library function provided by opencv; the classifier locates the face in the image, and the face is cropped into a (224, 224) standard image in preparation for image preprocessing;
step 2, performing binarization preprocessing on the standard image, and applying median filtering to reduce interference from invalid features, obtaining a binary image;
the standard image is converted to grayscale and binarized, and a median filtering algorithm is used to denoise the image in order to reduce the influence of irrelevant features such as beard and spots;
step 3, sending the standard image and the binary image respectively into the Mini_Xception model and the CNN7 model for parallel judgment, and recognizing the facial expression category in the image;
the CNN7 and Mini_Xception models each judge the image independently; the image categories are divided into normal and pain, with pain recorded as 1 and normal recorded as 0 during judgment, and the execution of the next step is determined by comparing the results;
step 4, if both model outputs indicate pain, namely the judgment results of the CNN7 model and the Mini_Xception model are equal and both 1, activating optimization algorithm two, going to step 5, performing combined probability optimization and outputting the final result; otherwise returning to step 1;
step 5, setting a counter sum to 1, taking statistics over the L images, and accumulating according to optimization algorithm two;
step 6, when sum over the L images is greater than or equal to K, going to step 7; otherwise returning to step 1;
step 7, generating an interrupt and outputting an alarm; the program prints a Warning cyclically until manual intervention, completing the judgment of the expression action; return to step 1.
The experimental results of Example 2 are shown in Table 2.
Table 2. Experimental results of optimization algorithm two
Claims (6)
1. A rapid expression recognition method based on dual-model probability optimization, characterized by comprising the following steps:
step 1, capturing a frame from the camera, and cropping out a face image as a standard image;
step 2, performing binarization preprocessing on the standard image, and applying median filtering to reduce interference from invalid features, obtaining a binary image;
step 3, sending the standard image and the binary image respectively into a Mini_Xception model and a CNN7 model for parallel judgment, and recognizing the facial expression category in the image; the facial expression categories are divided into normal and pain, with pain recorded as 1 and normal recorded as 0 during judgment;
step 4, if both model outputs indicate pain, namely the judgment results of the CNN7 model and the Mini_Xception model are equal and both 1, activating the optimization algorithm, going to step 5, performing combined probability optimization and outputting the final result; otherwise returning to step 1;
step 5, setting a counter sum to 1, taking statistics over the L images, and accumulating sum according to the optimization algorithm;
step 6, when sum accumulated over the L images reaches K or exceeds K, going to step 7; otherwise returning to step 1;
step 7, outputting an alarm to complete the judgment of the expression action; returning to step 1;
in step 4, the probability optimization combines the dual-model outputs using a mathematical model of the optimization algorithm and judges the overall expression action;
the optimization algorithm is selected from the following optimization algorithm one or optimization algorithm two:
(1) optimization algorithm one
when both model predictions indicate pain, the counter sum is set to 1 and the following L images are taken; each time both models predict pain, the counter sum is increased by 1, and when sum equals K an interrupt is generated and an alarm is triggered, according to the following formula:
wherein P is the overall success probability of the algorithm, P_00 represents the probability that both models simultaneously judge pain, K represents the set threshold, L represents the number of subsequent pictures taken for statistics after both models first judge pain simultaneously, and P_start represents the probability that, when the pain sequence begins, one of the first n photos is judged as pain by both models, given by the following formula:
(2) optimization algorithm two
setting the CNN7 model as model one with weight W, and the Mini_Xception model as model two with weight 1-W; when both model predictions indicate pain, the counter sum is set to 1 and the subsequent L images are taken; each time both models predict pain, the counter sum is increased by 1; when model one judges pain and model two judges normal, the counter sum is increased by W; when model two judges pain and model one judges normal, the counter sum is increased by 1-W; and when sum is greater than or equal to K, an interrupt is generated and an alarm is triggered; wherein K is the set threshold and L represents the number of subsequent pictures taken for statistics after both models first judge pain simultaneously;
if there exist a and b with 0 ≤ a ≤ L and 0 ≤ b ≤ L such that aW + b(1-W) < K, the success probability of the algorithm is given by the following formula:
wherein P_1 represents the probability that model one judges correctly, P_2 represents the probability that model two judges correctly, a and b respectively represent the numbers of correctly judged images, and the formula ranges over the set of all possible pairs (a, b); clearly there exist real numbers a and b in the interval [0, L] that make the inequality aW + b(1-W) < K hold.
2. The rapid expression recognition method based on dual-model probability optimization according to claim 1, wherein the CNN7 model mainly consists of two parts: the first part consists of 7 convolution modules using a Conv2D layer, a Dropout layer and a MaxPooling layer, all 7 convolution modules using ReLU as the activation function; the second part consists of fully connected layers, the first fully connected layer having 128 neurons with ReLU activation, and the second consisting of 2 neurons with softmax activation, which performs the classification.
3. The rapid expression recognition method based on dual-model probability optimization according to claim 1, wherein the Mini_Xception model consists of 3 parts: the first part is two Conv2D layers, using BN to reduce overfitting and ReLU as the activation function; the second part consists of five dual-channel modules, wherein the left channel consists of one Conv2D layer and one BN layer, and the right channel consists of two sub-modules, the first composed of a depthwise separable convolution layer, a BN layer and a ReLU layer, and the second composed of a depthwise separable convolution layer and a BN layer, followed by a MaxPooling layer to reduce the amount of computation; finally, an Add layer merges the two channels and passes the result to the next module; the third part consists of a Conv2D layer, a Dropout layer and a global average pooling layer, and the output features are finally classified with a softmax activation function.
4. A rapid expression recognition system based on dual-model probability optimization, characterized by mainly comprising four modules:
first, a face recognition and cropping module, which crops the face from the image output by the camera as the standard image;
second, an image preprocessing module, which binarizes the standard image and applies median filtering to reduce interference from invalid features, obtaining a binary image;
third, a dual-model prediction module, which sends the standard image and the binary image respectively to a Mini_Xception model and a CNN7 model for parallel judgment and recognizes the facial expression category in the image; the facial expression categories are divided into normal and pain, with pain recorded as 1 and normal recorded as 0 during judgment, and the execution of the next step is determined by comparing the results;
fourth, a combined probability optimization module, which performs probability optimization on the outputs of the Mini_Xception model and the CNN7 model and outputs the final result; the specific process is as follows: judge whether both model outputs indicate pain; if so, set the counter sum to 1, take statistics over the L images, and accumulate sum according to the optimization algorithm; when sum accumulated over the L images reaches the threshold K or exceeds K, output an alarm and complete the judgment of the expression action; otherwise, return to the face recognition and cropping module;
in the combined probability optimization module, the probability optimization combines the dual-model outputs using a mathematical model of the optimization algorithm and judges the overall expression action;
the optimization algorithm is selected from the following optimization algorithm one or optimization algorithm two:
(1) optimization algorithm one
when both model predictions indicate pain, the counter sum is set to 1 and the following L images are taken; each time both models predict pain, the counter sum is increased by 1, and when sum equals K an interrupt is generated and an alarm is triggered, according to the following formula:
wherein P is the overall success probability of the algorithm, P_00 represents the probability that both models simultaneously judge pain, K represents the set threshold, L represents the number of subsequent pictures taken for statistics after both models first judge pain simultaneously, and P_start represents the probability that, when the pain sequence begins, one of the first n photos is judged as pain by both models, given by the following formula:
(2) optimization algorithm two
setting the CNN7 model as model one with weight W, and the Mini_Xception model as model two with weight 1-W; when both model predictions indicate pain, the counter sum is set to 1 and the subsequent L images are taken; each time both models predict pain, the counter sum is increased by 1; when model one judges pain and model two judges normal, the counter sum is increased by W; when model two judges pain and model one judges normal, the counter sum is increased by 1-W; and when sum is greater than or equal to K, an interrupt is generated and an alarm is triggered; wherein K is the set threshold and L represents the number of subsequent pictures taken for statistics after both models first judge pain simultaneously;
if there exist a and b with 0 ≤ a ≤ L and 0 ≤ b ≤ L such that aW + b(1-W) < K, the success probability of the algorithm is given by the following formula:
wherein P_1 represents the probability that model one judges correctly, P_2 represents the probability that model two judges correctly, a and b respectively represent the numbers of correctly judged images, and the formula ranges over the set of all possible pairs (a, b); clearly there exist real numbers a and b in the interval [0, L] that make the inequality aW + b(1-W) < K hold.
5. The rapid expression recognition system based on dual-model probability optimization according to claim 4, wherein the CNN7 model mainly consists of two parts: the first part consists of 7 convolution modules using a Conv2D layer, a Dropout layer and a MaxPooling layer, all 7 convolution modules using ReLU as the activation function; the second part consists of fully connected layers, the first fully connected layer having 128 neurons with ReLU activation, and the second consisting of 2 neurons with softmax activation, which performs the classification.
6. The rapid expression recognition system based on dual-model probability optimization according to claim 4, wherein the Mini_Xception model mainly consists of 3 parts: the first part is two Conv2D layers, using BN to reduce overfitting and ReLU as the activation function; the second part consists of five dual-channel modules, wherein the left channel consists of one Conv2D layer and one BN layer, and the right channel consists of two sub-modules, the first composed of a depthwise separable convolution layer, a BN layer and a ReLU layer, and the second composed of a depthwise separable convolution layer and a BN layer, followed by a MaxPooling layer to reduce the amount of computation; finally, an Add layer merges the two channels and passes the result to the next module; the third part consists of a Conv2D layer, a Dropout layer and a global average pooling layer, and the output features are finally classified with a softmax activation function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110233127.8A CN112836679B (en) | 2021-03-03 | 2021-03-03 | Fast expression recognition algorithm and system based on dual-model probability optimization |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110233127.8A CN112836679B (en) | 2021-03-03 | 2021-03-03 | Fast expression recognition algorithm and system based on dual-model probability optimization |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112836679A CN112836679A (en) | 2021-05-25 |
CN112836679B true CN112836679B (en) | 2022-06-14 |
Family
ID=75934433
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110233127.8A Active CN112836679B (en) | 2021-03-03 | 2021-03-03 | Fast expression recognition algorithm and system based on dual-model probability optimization |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112836679B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107358169A (en) * | 2017-06-21 | 2017-11-17 | 厦门中控智慧信息技术有限公司 | A kind of facial expression recognizing method and expression recognition device |
CN107491726A (en) * | 2017-07-04 | 2017-12-19 | 重庆邮电大学 | A kind of real-time expression recognition method based on multi-channel parallel convolutional neural networks |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108108677A (en) * | 2017-12-12 | 2018-06-01 | 重庆邮电大学 | One kind is based on improved CNN facial expression recognizing methods |
CN110633669B (en) * | 2019-09-12 | 2024-03-26 | 华北电力大学(保定) | Mobile terminal face attribute identification method based on deep learning in home environment |
CN111695513B (en) * | 2020-06-12 | 2023-02-14 | 长安大学 | Facial expression recognition method based on depth residual error network |
-
2021
- 2021-03-03 CN CN202110233127.8A patent/CN112836679B/en active Active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107358169A (en) * | 2017-06-21 | 2017-11-17 | 厦门中控智慧信息技术有限公司 | A kind of facial expression recognizing method and expression recognition device |
CN107491726A (en) * | 2017-07-04 | 2017-12-19 | 重庆邮电大学 | A kind of real-time expression recognition method based on multi-channel parallel convolutional neural networks |
Non-Patent Citations (4)
Title |
---|
Facial expression recognition in videos: An CNN-LSTM based model for video classification;Muhammad Abdullah等;《IEEE Xplore》;20200516;全文 * |
Dynamic sequence facial expression recognition based on multiple visual descriptors and audio features; Li Hongfei et al.; Acta Electronica Sinica; 2019-08-31; pp. 1643-1653 *
Facial expression classification based on ensemble convolutional neural networks; Zhou Tao; Laser & Optoelectronics Progress; 2020-07-31; pp. 141501-1 to 141501-12 *
Expression recognition algorithm constructing parallel convolutional neural networks; Xu Linlin et al.; Journal of Image and Graphics; 2019-02-28; pp. 226-236 *
Also Published As
Publication number | Publication date |
---|---|
CN112836679A (en) | 2021-05-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110188615B (en) | Facial expression recognition method, device, medium and system | |
WO2019033525A1 (en) | Au feature recognition method, device and storage medium | |
CN115131880B (en) | Multi-scale attention fusion double-supervision human face living body detection method | |
CN111488855A (en) | Fatigue driving detection method, device, computer equipment and storage medium | |
CN112101096A (en) | Suicide emotion perception method based on multi-mode fusion of voice and micro-expression | |
CN105956570B (en) | Smiling face's recognition methods based on lip feature and deep learning | |
CN113869276B (en) | Lie recognition method and system based on micro-expression | |
CN114943997A (en) | Cerebral apoplexy patient expression classification algorithm and system based on attention and neural network | |
Angeloni et al. | Age estimation from facial parts using compact multi-stream convolutional neural networks | |
Roy et al. | Ear Biometric: A Deep Learning Approach | |
Ni et al. | Diverse local facial behaviors learning from enhanced expression flow for microexpression recognition | |
CN112836679B (en) | Fast expression recognition algorithm and system based on dual-model probability optimization | |
Kazmi et al. | Wavelets based facial expression recognition using a bank of neural networks | |
Wang et al. | Deep learning (DL)-enabled system for emotional big data | |
Hou | Deep learning-based human emotion detection framework using facial expressions | |
CN115205923A (en) | Micro-expression recognition method based on macro-expression state migration and mixed attention constraint | |
Khan et al. | Evaluating the Efficiency of CBAM-Resnet Using Malaysian Sign Language. | |
Fang et al. | FAF: A novel multimodal emotion recognition approach integrating face, body and text | |
Jain et al. | IoT-based micro-expression recognition for nervousness detection in COVID-Like condition | |
Nidhi et al. | From methods to datasets: a detailed study on facial emotion recognition | |
Kasodu et al. | CNN-based Drowsiness Detection with Alarm System to Prevent Microsleep | |
Agnihotri et al. | Vision based Interpreter for Sign Languages and Static Gesture Control using Convolutional Neural Network | |
Dharanaesh et al. | Video based facial emotion recognition system using deep learning | |
Aslam et al. | Emotion recognition techniques with rule based and machine learning approaches | |
Bhosale et al. | Stress Level and Emotion Detection via Video Analysis, and Chatbot Interventions for Emotional Distress |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |