CN112836679B - Fast expression recognition algorithm and system based on dual-model probability optimization - Google Patents
- Publication number
- CN112836679B CN202110233127.8A
- Authority
- CN
- China
- Prior art keywords
- model
- layer
- dual
- pain
- optimization
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a rapid expression recognition algorithm and system based on dual-model probability optimization, mainly comprising a face recognition and cropping module, an image preprocessing module, a dual-model prediction module and a combined probability optimization module. The face is cropped from the input image and binarized; the standard image and the binarized image are sent in parallel to a dual-model classifier trained on standard and binarized images, respectively; two optimization algorithms are provided to perform combined probability optimization on the dual-model outputs; and high-accuracy judgment and recognition are achieved by probability optimization of the recognition results of two low-accuracy lightweight neural network models. The invention achieves a recognition rate of no less than 99% for pain lasting the threshold time or longer, greatly reduces computation and storage cost, and can also effectively recognize other types of expressions.
Description
Technical Field
The invention belongs to the field of image processing and emotion recognition, and particularly relates to a rapid expression recognition algorithm and system based on dual-model probability optimization.
Background
With the development of society, people pay increasing attention to acute diseases. Because acute diseases have sudden onset, severe symptoms and rapid progression, effective prevention is difficult; an attack is often accompanied by severe pain that leaves the patient unable to call for help or receive timely treatment, which greatly threatens the patient's life. At present, acute episodes are usually detected by human caregivers, by video monitoring, or by specific sensing devices attached to the patient's body, which wastes human resources and restricts the patient's movement. Although existing neural networks already achieve high accuracy, their storage and computation costs are high. All of these factors limit the popularization of intelligent terminals for this kind of special monitoring.
Expressions are generally divided into six basic categories: happiness, sadness, fear, anger, disgust and surprise. Facial expressions reflect an individual's psychological state: facial expression conveys about 55% of emotional information, while vocal tone and language convey only about 38% and 7%, respectively, so pain can be expressed through facial expression even though it is not conventionally classified as an emotion. Conventional expression recognition algorithms generally fall into two categories: static image classification, which is faster but cannot capture the spatio-temporal features of facial expressions well, and dynamic video classification, which captures spatio-temporal features effectively but is computationally more complex. Most such models focus on changes to the network structure and design; although their accuracy is very high, their computation and storage costs are large, making them unsuitable for deployment on intelligent terminals with low computing power.
CN111466878A discloses a real-time monitoring method and device for pain symptoms of bedridden patients based on expression recognition. The method comprises: (1) establishing a pain expression training data set; (2) establishing and training a neural network model for analyzing pain expression to obtain a pain grading model; (3) acquiring three real-time facial images at the same moment, preprocessing them, and inputting them into the neural network model to obtain the probabilities corresponding to the A pain levels; the pain level with the maximum probability is selected as the pain level of the detected image at the current moment, and an alarm is raised when the pain level exceeds a threshold, thereby realizing real-time monitoring. That invention can evaluate pain accurately, automatically and in real time, thus effectively monitoring bedridden patients.
CN106682616B discloses a neonatal pain expression recognition method based on dual-channel feature deep learning. The neonatal face image is first converted to grayscale and its local binary pattern (LBP) feature map is extracted; a two-channel convolutional neural network then performs deep learning on the grayscale image and its LBP feature map, which are input in parallel; finally, a softmax-based classifier classifies the fused two-channel features into four expressions: calm, crying, mild pain and severe pain. By combining the feature information of the grayscale image and the LBP feature map, the method can effectively recognize calm, crying, mild pain and severe pain, is robust to illumination, noise and occlusion in neonatal face images, and provides a new method and approach for developing a neonatal pain expression recognition system; however, its computation cost is high, which hinders deployment on intelligent terminals.
CN202011036047.5 discloses an intelligent recognition method for the pain expression of elderly persons on a nursing bed. It combines a deep belief network to extract pain expression features from the face image, which describe the expression more effectively, and, to address the small sample size in pain expression recognition, uses a generative model with a semi-supervised learning method that combines labeled and unlabeled samples. That invention focuses on adjusting the structure and parameters of the neural network so that pain expressions can be recognized effectively, but once the neural network is trained its accuracy can no longer be adjusted, owing to the uncertainty inherent in the neural network training process.
To solve the problem of recognizing acute disease, the invention provides an acute disease recognition algorithm that recognizes pain lasting a threshold time or longer from the patient's expression and effectively ignores incidental transient pain.
Disclosure of Invention
To solve the above problems, the invention provides an algorithm that continuously judges the peak of an expression action by a combined probability optimization algorithm based on two simple models trained on static images. The method can quickly and accurately recognize persistent pain lasting a set threshold time or longer, while ignoring transient incidental pain.
To solve the above technical problems, the invention provides a rapid expression recognition algorithm based on dual-model probability optimization, which recognizes expression actions mainly through combined probability optimization of the dual-model outputs. The technical scheme is specifically as follows:
a fast expression recognition algorithm based on dual-model probability optimization specifically comprises the following steps:
step 1, capturing a frame from the camera and cropping out a face image as the standard image;
the video frame is passed to a Haar cascade face classifier through a library function provided by opencv; the classifier locates the face in the image, and the face is cropped as the standard image in preparation for image preprocessing;
step 2, performing binarization preprocessing on the standard image, and applying median filtering to reduce interference from invalid features, obtaining a binary image;
the standard image is converted to grayscale and binarized, and a median filtering algorithm is used to denoise the image in order to reduce the influence of irrelevant features such as beard and spots;
step 3, sending the standard image and the binary image respectively into the Mini_Xception model and the CNN7 model for parallel judgment, and recognizing the facial expression category in the image;
the CNN7 model and the Mini_Xception model each judge the image independently; the facial expression categories are divided into normal and pain, with pain recorded as 1 and normal recorded as 0 during judgment;
step 4, if both model outputs indicate pain, namely the judgment results of the CNN7 model and the Mini_Xception model are equal and both 1, activating the optimization algorithm, going to step 5, performing combined probability optimization and outputting the final result; otherwise returning to step 1;
step 5, setting a counter sum to 1, taking statistics over the L images, and accumulating according to the optimization algorithm;
after the counter sum is set to 1, the judgment results of the subsequent L images are recorded; when optimization algorithm one is used, the counter sum is increased by 1 each time both models judge pain; when optimization algorithm two is used, sum is increased by 1 when both models judge pain, by W when only model one judges pain, and by 1-W when only model two judges pain;
step 6, when sum accumulated over the L images reaches K or exceeds K, going to step 7; otherwise returning to step 1;
when optimization algorithm one is used, step 7 is entered when the counter sum equals the threshold K; when optimization algorithm two is used, step 7 is entered when the counter sum is greater than or equal to K;
step 7, outputting an alarm to complete the judgment of the expression action; returning to step 1;
when the output condition is met, the program prints a Warning cyclically until manual intervention. A minimal sketch of the per-frame processing in steps 1 to 3 is given below.
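The following is a minimal Python/OpenCV sketch of the per-frame processing in steps 1 to 3. The Haar cascade file, the 224x224 crop size, the Otsu thresholding, the 3x3 median-filter kernel, the use of a grayscale standard image and the placeholder classifier handles `mini_xception` and `cnn7` are assumptions made for illustration and are not fixed by the invention.

```python
import cv2
import numpy as np

# Haar cascade face detector shipped with opencv-python (file choice assumed)
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def preprocess(frame, size=(224, 224)):
    """Steps 1-2: crop the face as the standard image and derive the binary image."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None, None
    x, y, w, h = faces[0]
    standard = cv2.resize(gray[y:y + h, x:x + w], size)   # standard (grayscale) face image
    _, binary = cv2.threshold(standard, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    binary = cv2.medianBlur(binary, 3)                     # suppress beard/spot-like noise
    return standard, binary

def dual_predict(standard, binary, mini_xception, cnn7):
    """Step 3: parallel judgment by the two classifiers; 1 = pain, 0 = normal."""
    r_mini = int(np.argmax(mini_xception.predict(standard[None, ..., None] / 255.0)))
    r_cnn7 = int(np.argmax(cnn7.predict(binary[None, ..., None] / 255.0)))
    return r_cnn7, r_mini
```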
Furthermore, the CNN7 model mainly consists of two parts: the first part consists of 7 convolution modules using a two-dimensional convolution layer (Conv2D), a Dropout layer and a maximum pooling layer (MaxPooling); all 7 convolution modules use ReLU as the activation function, which reduces the computation cost of the neural network and improves the efficiency of gradient descent and back-propagation; the second part consists of fully connected layers, the first of which has 128 neurons with ReLU activation, while the second consists of 2 neurons with softmax activation and performs the classification.
The Mini_Xception model mainly consists of 3 parts: the first part is two Conv2D layers, both using a batch normalization (BN) layer to reduce overfitting and ReLU as the activation function; the second part consists of five dual-channel modules, in which the left channel consists of one Conv2D layer and one BN layer, and the right channel consists of two sub-modules, the first composed of a depthwise separable convolution layer, a BN layer and a ReLU layer, and the second composed of a depthwise separable convolution layer and a BN layer, followed by a MaxPooling layer to reduce the amount of computation; finally, an Add layer merges the two channels and passes the result to the next module; the third part consists of a Conv2D layer, a Dropout layer and a global average pooling layer, and the output features are finally classified with a softmax activation function.
Furthermore, the probability optimization combines the dual-model outputs using the mathematical model of an optimization algorithm and makes a judgment on the overall expression action. To this end, the invention provides two probability optimization algorithms that solve this problem. Preferably, optimization algorithm two is selected and improved, since it coordinates the relationship between L and K well, so that the false-alarm interval and the false-alarm rate are reduced as much as possible while the accuracy is guaranteed.
Specifically, the first optimization algorithm is as follows:
when both model predictions indicate pain, the counter sum is set to 1 and the following L images are taken; each time both models again predict pain, the counter sum is increased by 1, and when sum equals K an interrupt is generated and an alarm is triggered; the formula is as follows:
wherein P is the overall success probability of the algorithm, P_00 represents the probability that both models simultaneously judge pain, K represents the set threshold, L represents the number of subsequent images taken for statistics after both models first judge pain simultaneously, and P_start represents the probability that, when the pain sequence begins, one of the first n photos is judged as pain by both models, given by the following formula:
specifically, the second optimization algorithm is as follows:
the weight of model one is set to W and the weight of model two to 1-W; when both model predictions indicate pain, the counter sum is set to 1 and the following L images are taken; each time both models predict pain, the counter sum is increased by 1; when model one judges pain and model two judges normal, the counter sum is increased by W; when model two judges pain and model one judges normal, the counter sum is increased by 1-W; when sum is greater than or equal to K, an interrupt is generated and an alarm is triggered; wherein K is the set threshold and L is the number of subsequent images taken for statistics after both models first judge pain simultaneously;
if there exist a and b with 0 ≤ a ≤ L and 0 ≤ b ≤ L such that aW + b(1-W) < K, the success probability of the algorithm is given by the following formula:
wherein P_1 represents the probability that model one judges correctly, P_2 represents the probability that model two judges correctly, a and b respectively represent the numbers of correctly judged images, and the formula ranges over the set of all possible pairs (a, b); clearly there exist real numbers a and b in the interval [0, L] that make the inequality aW + b(1-W) < K hold.
The invention also aims to provide a rapid expression recognition system based on dual-model probability optimization, which mainly comprises four modules,
first, a face recognition and cropping module, which crops the face from the image output by the camera as the standard image;
second, an image preprocessing module, which binarizes the standard image and applies median filtering to reduce interference from invalid features, obtaining a binary image;
third, a dual-model prediction module, which sends the standard image and the binary image respectively to the Mini_Xception model and the CNN7 model for parallel judgment and recognizes the facial expression category in the image; the facial expression categories are divided into normal and pain, with pain recorded as 1 and normal recorded as 0 during judgment, and the execution of the next step is determined by comparing the results;
fourth, a combined probability optimization module, which performs probability optimization on the outputs of the Mini_Xception model and the CNN7 model and outputs the final result; the specific process is as follows: judge whether both model outputs indicate pain; if so, set the counter sum to 1, take statistics over the L images, and accumulate sum according to the optimization algorithm; when sum accumulated over the L images reaches the threshold K or exceeds K, output an alarm and complete the judgment of the expression action; otherwise, return to module one.
In order to realize automatic recognition during acute attacks and reduce computation and storage cost, the invention provides an acute disease recognition algorithm and system that judge according to the peak of the patient's pain expression action. The face is cropped from the input image and binarized; the standard image and the binary image are sent in parallel to a dual-model classifier trained on standard and binarized images, respectively; two optimization algorithms are provided to perform combined probability optimization on the dual-model outputs; and high-accuracy judgment and recognition are achieved by probability optimization of the recognition results of two low-accuracy lightweight neural network models. The invention achieves a recognition rate of no less than 99% for pain lasting the threshold time or longer, greatly reduces computation and storage cost, and can also effectively recognize other types of expressions.
Compared with the prior art, the invention has the following beneficial effects and advances:
On the basis of the neural network judgments, the invention optimally combines the neural network outputs through an optimization algorithm, achieving high-accuracy recognition of various continuous expression actions. On the one hand, the invention uses neural network models with a simple structure and an algorithm with a concise idea, which effectively reduces computation and storage cost and gives a high computation speed; on the other hand, a mathematical model of the probability optimization algorithm is provided, which offers better controllability than the uncertainty of a pure neural network. The invention therefore has high application value in fields such as acute disease alarm and social emotion recognition.
Drawings
FIG. 1 is a flow chart of the structure of the recognition method of the present invention;
FIG. 2 is a schematic structural diagram of CNN 7;
FIG. 3 is a schematic structural diagram of Mini_Xception;
FIG. 4 is a schematic diagram of a probabilistic optimization algorithm idea.
Detailed Description
The following detailed description of embodiments of the invention is provided in conjunction with the accompanying drawings:
the embodiment provides a rapid expression recognition system based on dual-model probability optimization, which mainly comprises four modules, as shown in fig. 1. Firstly, a face recognition cutting module cuts a face from an image output by a camera to be used as a standard image; secondly, the image preprocessing module is used for carrying out binaryzation on the standard image and carrying out median filtering for reducing interference of invalid features to obtain a binary image; thirdly, a dual-model prediction module is used for recognizing the facial expression category in the image; and fourthly, combining a probability optimization module to perform probability optimization on the dual-model output result and output a final result.
Before using a probability optimization algorithm to perform overall judgment on a model output result, a classification model is required to classify facial expressions, in the embodiment, a standard image and a binary image are respectively sent to a Mini _ Xcenter model and a CNN7 model to perform parallel judgment, and the facial expression categories in the images are identified; the facial expression categories are divided into normal and painful categories, the pain is recorded as 1 in the judging process, the pain is recorded as 0 in the normal condition, and the execution of the next step is determined by comparing results;
(1)CNN7
CNN7 mainly consists of two parts. The first part consists of 7 convolution modules using a two-dimensional convolution layer (Conv2D), a Dropout layer and a maximum pooling layer (MaxPooling); all 7 convolution modules use ReLU as the activation function, which reduces the computation cost of the neural network and improves the efficiency of gradient descent and back-propagation. The second part consists of fully connected layers: the first fully connected layer has 128 neurons with ReLU activation, and the second consists of 2 neurons with softmax activation, which performs the classification. The network structure is shown in fig. 2, and a minimal Keras sketch is given below.
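The following Keras sketch illustrates this two-part structure. The filter counts, kernel sizes, dropout rate and the 224x224 single-channel input are assumptions; the description above fixes only the layer types, the seven-module arrangement, the 128-neuron dense layer and the 2-way softmax output.

```python
from tensorflow.keras import layers, models

def build_cnn7(input_shape=(224, 224, 1),
               filters=(16, 32, 64, 64, 128, 128, 256)):
    model = models.Sequential()
    # Part 1: seven convolution modules, each Conv2D -> Dropout -> MaxPooling with ReLU
    for i, f in enumerate(filters):
        if i == 0:
            model.add(layers.Conv2D(f, (3, 3), padding="same",
                                    activation="relu", input_shape=input_shape))
        else:
            model.add(layers.Conv2D(f, (3, 3), padding="same", activation="relu"))
        model.add(layers.Dropout(0.25))
        model.add(layers.MaxPooling2D((2, 2)))
    # Part 2: fully connected head, 128 ReLU neurons then a 2-way softmax (normal / pain)
    model.add(layers.Flatten())
    model.add(layers.Dense(128, activation="relu"))
    model.add(layers.Dense(2, activation="softmax"))
    return model
```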
(2)Mini_Xception
Mini_Xception mainly consists of 3 parts. The first part is two Conv2D layers, both using a batch normalization (BN) layer to reduce overfitting and ReLU as the activation function. The second part consists of five dual-channel modules: the left channel consists of one Conv2D layer and one BN layer; the right channel consists of two sub-modules, the first composed of a depthwise separable convolution layer, a BN layer and a ReLU layer, and the second composed of a depthwise separable convolution layer and a BN layer, followed by a MaxPooling layer to reduce the amount of computation; finally, an Add layer merges the two channels and passes the result to the next module. The third part consists of a Conv2D layer, a Dropout layer and a global average pooling layer, and the output features are finally classified with a softmax activation function. The structure is shown in fig. 3, and a minimal Keras sketch is given below.
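The following Keras sketch illustrates this three-part structure. The filter counts, kernel sizes, dropout rate, the stride-2 convolution in the left channel (chosen so that the two channels of each module have matching shapes before the Add layer) and the 224x224 single-channel input are assumptions; the description above fixes only the layer types and the three-part, five-module arrangement.

```python
from tensorflow.keras import layers, models

def build_mini_xception(input_shape=(224, 224, 1),
                        module_filters=(16, 32, 64, 128, 128)):
    inputs = layers.Input(shape=input_shape)
    # Part 1: two Conv2D layers, each followed by BatchNormalization and ReLU
    x = inputs
    for f in (8, 8):
        x = layers.Conv2D(f, (3, 3), padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
    # Part 2: five dual-channel modules merged by an Add layer
    for f in module_filters:
        # left channel: Conv2D + BN (stride 2 assumed so both channels downsample equally)
        left = layers.Conv2D(f, (1, 1), strides=(2, 2), padding="same")(x)
        left = layers.BatchNormalization()(left)
        # right channel: SeparableConv2D + BN + ReLU, SeparableConv2D + BN, then MaxPooling
        right = layers.SeparableConv2D(f, (3, 3), padding="same")(x)
        right = layers.BatchNormalization()(right)
        right = layers.Activation("relu")(right)
        right = layers.SeparableConv2D(f, (3, 3), padding="same")(right)
        right = layers.BatchNormalization()(right)
        right = layers.MaxPooling2D((3, 3), strides=(2, 2), padding="same")(right)
        x = layers.Add()([left, right])
    # Part 3: Conv2D, Dropout, global average pooling and a softmax over the 2 classes
    x = layers.Conv2D(2, (3, 3), padding="same")(x)
    x = layers.Dropout(0.25)(x)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Activation("softmax")(x)
    return models.Model(inputs, outputs)
```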
Based on the judgment accuracy of the two models, a mathematical model of the probability optimization algorithm is established, the model outputs are combined, and the overall expression action is judged. The idea of the algorithm is illustrated in FIG. 4;
(1) optimization algorithm one
When both model predictions indicate pain, the counter sum is set to 1 and the following L images are taken; each time both models again predict pain, the counter sum is increased by 1, and when the counter sum equals K an interrupt is generated and an alarm is triggered. The formula is as follows:
wherein P is the overall success probability of the algorithm, P_00 represents the probability that both models simultaneously judge pain, K represents the set threshold, L represents the number of subsequent images taken for statistics after both models first judge pain simultaneously, and P_start represents the probability that, when the pain sequence begins, one of the first n photos is judged as pain by both models, given by the following formula:
(2) optimization algorithm two
The weight of model one is set to W and the weight of model two to 1-W. When both model predictions indicate pain, the counter sum is set to 1 and the following L images are taken; each time both models predict pain, the counter sum is increased by 1; when model one judges pain and model two judges normal, the counter sum is increased by W; when model two judges pain and model one judges normal, the counter sum is increased by 1-W; when sum is greater than or equal to K, an interrupt is generated and an alarm is triggered. Here K is the set threshold and L is the number of subsequent pictures taken for statistics after both models first judge pain simultaneously.
If there exist a and b with 0 ≤ a ≤ L and 0 ≤ b ≤ L such that aW + b(1-W) < K, the success probability of the algorithm is given by the following formula:
wherein P_1 represents the probability that model one judges correctly, P_2 represents the probability that model two judges correctly, a and b respectively represent the numbers of correctly judged images, and the formula ranges over the set of all possible pairs (a, b); clearly there exist real numbers a and b in the interval [0, L] that make the inequality aW + b(1-W) < K hold. A minimal sketch of the counting rules of both optimization algorithms is given below.
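The following is a minimal sketch, assuming the per-frame pain labels have already been produced by the two classifiers, of the counting rules of the two optimization algorithms described above. The lists r1 and r2 hold the decisions of model one (CNN7) and model two (Mini_Xception), 1 for pain and 0 for normal, over the L frames that follow the activating joint detection; L, K and W are the window length, the threshold and the model-one weight.

```python
def algorithm_one(r1, r2, K):
    """Alarm when the joint-pain count, starting from sum = 1, reaches K."""
    total = 1
    for a, b in zip(r1, r2):
        if a == 1 and b == 1:
            total += 1
        if total >= K:          # interrupt and trigger the alarm
            return True
    return False

def algorithm_two(r1, r2, K, W):
    """Weighted count: +1 if both judge pain, +W if only model one, +(1-W) if only model two."""
    total = 1
    for a, b in zip(r1, r2):
        if a == 1 and b == 1:
            total += 1
        elif a == 1:
            total += W
        elif b == 1:
            total += 1 - W
        if total >= K:          # interrupt and trigger the alarm
            return True
    return False
```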
The rapid expression recognition algorithm based on dual-model probability optimization provided by the invention is illustrated by two specific embodiments. In the experiments, 100 pain samples were input into the system to simulate 4 seconds of pain and it was tested whether the system raised an alarm; each algorithm was tested 2000 times to obtain the data.
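As a purely hypothetical illustration of how the overall success probability of either counting rule could be estimated, the following Monte Carlo sketch assumes the two models make independent per-frame decisions on pain frames with accuracies P1 and P2, and reuses the algorithm_one / algorithm_two sketches above; the values of P1, P2, L, K and W in the usage example are placeholders, not the settings behind the results reported in Tables 1 and 2.

```python
import random

def estimate_power(alarm_fn, P1, P2, L, trials=2000):
    """Fraction of simulated pain episodes in which the alarm fires."""
    hits = 0
    for _ in range(trials):
        # each model labels each of the L pain frames correctly (1 = pain)
        # with its own per-frame accuracy, independently of the other model
        r1 = [1 if random.random() < P1 else 0 for _ in range(L)]
        r2 = [1 if random.random() < P2 else 0 for _ in range(L)]
        if alarm_fn(r1, r2):
            hits += 1
    return hits / trials

# e.g. estimate_power(lambda a, b: algorithm_two(a, b, K=20, W=0.6),
#                     P1=0.85, P2=0.80, L=30)
```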
Example 1
A rapid expression recognition algorithm based on dual-model probability optimization:
step 1, capturing a frame from the camera and cropping out a face image as the standard image;
the video frame is passed to a Haar cascade face classifier through a library function provided by opencv; the classifier locates the face in the image, and the face is cropped into a (224, 224) standard image in preparation for image preprocessing;
step 2, performing binarization preprocessing on the standard image, and applying median filtering to reduce interference from invalid features, obtaining a binary image;
the standard image is converted to grayscale and binarized, and a median filtering algorithm is used to denoise the image in order to reduce the influence of irrelevant features such as beard and spots;
step 3, sending the standard image and the binary image respectively into the Mini_Xception model and the CNN7 model for parallel judgment, and recognizing the facial expression category in the image;
the CNN7 and Mini_Xception models each judge the image independently; the image categories are divided into normal and pain, with pain recorded as 1 and normal recorded as 0 during judgment, and the execution of the next step is determined by comparing the results;
step 4, if both model outputs indicate pain, namely the judgment results of the CNN7 model and the Mini_Xception model are equal and both 1, activating optimization algorithm one, going to step 5, performing combined probability optimization and outputting the final result; otherwise returning to step 1;
step 5, setting a counter sum to 1, taking statistics over the L images, and accumulating according to optimization algorithm one: each time both models judge pain, the counter sum is increased by 1;
step 6, when sum accumulated over the L images reaches K, going to step 7; otherwise returning to step 1;
step 7, generating an interrupt and outputting an alarm; the program prints a Warning cyclically until manual intervention, completing the judgment of the expression action; return to step 1.
The experimental results of Example 1 are shown in Table 1.
Table 1. Experimental results of optimization algorithm one
Acc represents the recognition accuracy of the entire expression action.
Example 2
A rapid expression recognition algorithm based on dual-model probability optimization:
step 1, capturing a frame from the camera and cropping out a face image as the standard image;
the video frame is passed to a Haar cascade face classifier through a library function provided by opencv; the classifier locates the face in the image, and the face is cropped into a (224, 224) standard image in preparation for image preprocessing;
step 2, performing binarization preprocessing on the standard image, and applying median filtering to reduce interference from invalid features, obtaining a binary image;
the standard image is converted to grayscale and binarized, and a median filtering algorithm is used to denoise the image in order to reduce the influence of irrelevant features such as beard and spots;
step 3, sending the standard image and the binary image respectively into the Mini_Xception model and the CNN7 model for parallel judgment, and recognizing the facial expression category in the image;
the CNN7 and Mini_Xception models each judge the image independently; the image categories are divided into normal and pain, with pain recorded as 1 and normal recorded as 0 during judgment, and the execution of the next step is determined by comparing the results;
step 4, if both model outputs indicate pain, namely the judgment results of the CNN7 model and the Mini_Xception model are equal and both 1, activating optimization algorithm two, going to step 5, performing combined probability optimization and outputting the final result; otherwise returning to step 1;
step 5, setting a counter sum to 1, taking statistics over the L images, and accumulating according to optimization algorithm two;
step 6, when sum over the L images is greater than or equal to K, going to step 7; otherwise returning to step 1;
step 7, generating an interrupt and outputting an alarm; the program prints a Warning cyclically until manual intervention, completing the judgment of the expression action; return to step 1.
The experimental results of Example 2 are shown in Table 2.
Table 2. Experimental results of optimization algorithm two
Claims (6)
1. A rapid expression recognition method based on dual-model probability optimization, characterized by comprising the following steps:
step 1, capturing a frame from the camera, and cropping out a face image as a standard image;
step 2, performing binarization preprocessing on the standard image, and applying median filtering to reduce interference from invalid features, obtaining a binary image;
step 3, sending the standard image and the binary image respectively into a Mini_Xception model and a CNN7 model for parallel judgment, and recognizing the facial expression category in the image; the facial expression categories are divided into normal and pain, with pain recorded as 1 and normal recorded as 0 during judgment;
step 4, if both model outputs indicate pain, namely the judgment results of the CNN7 model and the Mini_Xception model are equal and both 1, activating the optimization algorithm, going to step 5, performing combined probability optimization and outputting the final result; otherwise returning to step 1;
step 5, setting a counter sum to 1, taking statistics over the L images, and accumulating sum according to the optimization algorithm;
step 6, when sum accumulated over the L images reaches K or exceeds K, going to step 7; otherwise returning to step 1;
step 7, outputting an alarm to complete the judgment of the expression action; returning to step 1;
in step 4, the probability optimization combines the dual-model outputs using a mathematical model of the optimization algorithm and judges the overall expression action;
the optimization algorithm is selected from the following optimization algorithm one or optimization algorithm two:
(1) optimization algorithm one
when both model predictions indicate pain, the counter sum is set to 1 and the following L images are taken; each time both models predict pain, the counter sum is increased by 1, and when sum equals K an interrupt is generated and an alarm is triggered, according to the following formula:
wherein P is the overall success probability of the algorithm, P_00 represents the probability that both models simultaneously judge pain, K represents the set threshold, L represents the number of subsequent pictures taken for statistics after both models first judge pain simultaneously, and P_start represents the probability that, when the pain sequence begins, one of the first n photos is judged as pain by both models, given by the following formula:
(2) optimization algorithm two
setting the CNN7 model as model one with weight W, and the Mini_Xception model as model two with weight 1-W; when both model predictions indicate pain, the counter sum is set to 1 and the subsequent L images are taken; each time both models predict pain, the counter sum is increased by 1; when model one judges pain and model two judges normal, the counter sum is increased by W; when model two judges pain and model one judges normal, the counter sum is increased by 1-W; and when sum is greater than or equal to K, an interrupt is generated and an alarm is triggered; wherein K is the set threshold and L represents the number of subsequent pictures taken for statistics after both models first judge pain simultaneously;
if there exist a and b with 0 ≤ a ≤ L and 0 ≤ b ≤ L such that aW + b(1-W) < K, the success probability of the algorithm is given by the following formula:
wherein P_1 represents the probability that model one judges correctly, P_2 represents the probability that model two judges correctly, a and b respectively represent the numbers of correctly judged images, and the formula ranges over the set of all possible pairs (a, b); clearly there exist real numbers a and b in the interval [0, L] that make the inequality aW + b(1-W) < K hold.
2. The rapid expression recognition method based on dual-model probability optimization according to claim 1, wherein the CNN7 model mainly consists of two parts: the first part consists of 7 convolution modules using a Conv2D layer, a Dropout layer and a MaxPooling layer, all 7 convolution modules using ReLU as the activation function; the second part consists of fully connected layers, the first fully connected layer having 128 neurons with ReLU activation, and the second consisting of 2 neurons with softmax activation, which performs the classification.
3. The rapid expression recognition method based on dual-model probability optimization according to claim 1, wherein the Mini_Xception model consists of 3 parts: the first part is two Conv2D layers, using BN to reduce overfitting and ReLU as the activation function; the second part consists of five dual-channel modules, wherein the left channel consists of one Conv2D layer and one BN layer, and the right channel consists of two sub-modules, the first composed of a depthwise separable convolution layer, a BN layer and a ReLU layer, and the second composed of a depthwise separable convolution layer and a BN layer, followed by a MaxPooling layer to reduce the amount of computation; finally, an Add layer merges the two channels and passes the result to the next module; the third part consists of a Conv2D layer, a Dropout layer and a global average pooling layer, and the output features are finally classified with a softmax activation function.
4. A rapid expression recognition system based on dual-model probability optimization, characterized by mainly comprising four modules:
first, a face recognition and cropping module, which crops the face from the image output by the camera as the standard image;
second, an image preprocessing module, which binarizes the standard image and applies median filtering to reduce interference from invalid features, obtaining a binary image;
third, a dual-model prediction module, which sends the standard image and the binary image respectively to a Mini_Xception model and a CNN7 model for parallel judgment and recognizes the facial expression category in the image; the facial expression categories are divided into normal and pain, with pain recorded as 1 and normal recorded as 0 during judgment, and the execution of the next step is determined by comparing the results;
fourth, a combined probability optimization module, which performs probability optimization on the outputs of the Mini_Xception model and the CNN7 model and outputs the final result; the specific process is as follows: judge whether both model outputs indicate pain; if so, set the counter sum to 1, take statistics over the L images, and accumulate sum according to the optimization algorithm; when sum accumulated over the L images reaches the threshold K or exceeds K, output an alarm and complete the judgment of the expression action; otherwise, return to the face recognition and cropping module;
in the combined probability optimization module, the probability optimization combines the dual-model outputs using a mathematical model of the optimization algorithm and judges the overall expression action;
the optimization algorithm is selected from the following optimization algorithm one or optimization algorithm two:
(1) optimization algorithm one
when both model predictions indicate pain, the counter sum is set to 1 and the following L images are taken; each time both models predict pain, the counter sum is increased by 1, and when sum equals K an interrupt is generated and an alarm is triggered, according to the following formula:
wherein P is the overall success probability of the algorithm, P_00 represents the probability that both models simultaneously judge pain, K represents the set threshold, L represents the number of subsequent pictures taken for statistics after both models first judge pain simultaneously, and P_start represents the probability that, when the pain sequence begins, one of the first n photos is judged as pain by both models, given by the following formula:
(2) optimization algorithm two
setting the CNN7 model as model one with weight W, and the Mini_Xception model as model two with weight 1-W; when both model predictions indicate pain, the counter sum is set to 1 and the subsequent L images are taken; each time both models predict pain, the counter sum is increased by 1; when model one judges pain and model two judges normal, the counter sum is increased by W; when model two judges pain and model one judges normal, the counter sum is increased by 1-W; and when sum is greater than or equal to K, an interrupt is generated and an alarm is triggered; wherein K is the set threshold and L represents the number of subsequent pictures taken for statistics after both models first judge pain simultaneously;
if there exist a and b with 0 ≤ a ≤ L and 0 ≤ b ≤ L such that aW + b(1-W) < K, the success probability of the algorithm is given by the following formula:
wherein P_1 represents the probability that model one judges correctly, P_2 represents the probability that model two judges correctly, a and b respectively represent the numbers of correctly judged images, and the formula ranges over the set of all possible pairs (a, b); clearly there exist real numbers a and b in the interval [0, L] that make the inequality aW + b(1-W) < K hold.
5. The rapid expression recognition system based on dual-model probability optimization according to claim 4, wherein the CNN7 model mainly consists of two parts: the first part consists of 7 convolution modules using a Conv2D layer, a Dropout layer and a MaxPooling layer, all 7 convolution modules using ReLU as the activation function; the second part consists of fully connected layers, the first fully connected layer having 128 neurons with ReLU activation, and the second consisting of 2 neurons with softmax activation, which performs the classification.
6. The rapid expression recognition system based on dual-model probability optimization according to claim 4, wherein the Mini_Xception model mainly consists of 3 parts: the first part is two Conv2D layers, using BN to reduce overfitting and ReLU as the activation function; the second part consists of five dual-channel modules, wherein the left channel consists of one Conv2D layer and one BN layer, and the right channel consists of two sub-modules, the first composed of a depthwise separable convolution layer, a BN layer and a ReLU layer, and the second composed of a depthwise separable convolution layer and a BN layer, followed by a MaxPooling layer to reduce the amount of computation; finally, an Add layer merges the two channels and passes the result to the next module; the third part consists of a Conv2D layer, a Dropout layer and a global average pooling layer, and the output features are finally classified with a softmax activation function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110233127.8A CN112836679B (en) | 2021-03-03 | 2021-03-03 | Fast expression recognition algorithm and system based on dual-model probability optimization |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110233127.8A CN112836679B (en) | 2021-03-03 | 2021-03-03 | Fast expression recognition algorithm and system based on dual-model probability optimization |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112836679A CN112836679A (en) | 2021-05-25 |
CN112836679B true CN112836679B (en) | 2022-06-14 |
Family
ID=75934433
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110233127.8A Active CN112836679B (en) | 2021-03-03 | 2021-03-03 | Fast expression recognition algorithm and system based on dual-model probability optimization |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112836679B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107358169A (en) * | 2017-06-21 | 2017-11-17 | 厦门中控智慧信息技术有限公司 | A kind of facial expression recognizing method and expression recognition device |
CN107491726A (en) * | 2017-07-04 | 2017-12-19 | 重庆邮电大学 | A kind of real-time expression recognition method based on multi-channel parallel convolutional neural networks |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108108677A (en) * | 2017-12-12 | 2018-06-01 | 重庆邮电大学 | One kind is based on improved CNN facial expression recognizing methods |
CN110633669B (en) * | 2019-09-12 | 2024-03-26 | 华北电力大学(保定) | Mobile terminal face attribute identification method based on deep learning in home environment |
CN111695513B (en) * | 2020-06-12 | 2023-02-14 | 长安大学 | Facial expression recognition method based on depth residual error network |
-
2021
- 2021-03-03 CN CN202110233127.8A patent/CN112836679B/en active Active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107358169A (en) * | 2017-06-21 | 2017-11-17 | 厦门中控智慧信息技术有限公司 | A kind of facial expression recognizing method and expression recognition device |
CN107491726A (en) * | 2017-07-04 | 2017-12-19 | 重庆邮电大学 | A kind of real-time expression recognition method based on multi-channel parallel convolutional neural networks |
Non-Patent Citations (4)
Title |
---|
Facial expression recognition in videos: An CNN-LSTM based model for video classification;Muhammad Abdullah等;《IEEE Xplore》;20200516;全文 * |
Dynamic sequence facial expression recognition based on multiple visual descriptors and audio features; Li Hongfei et al.; Acta Electronica Sinica; 2019-08-31; pp. 1643-1653 *
Facial expression classification based on ensemble convolutional neural networks; Zhou Tao; Laser & Optoelectronics Progress; 2020-07-31; pp. 141501-1 to 141501-12 *
Expression recognition algorithm constructing parallel convolutional neural networks; Xu Linlin et al.; Journal of Image and Graphics; 2019-02-28; pp. 226-236 *
Also Published As
Publication number | Publication date |
---|---|
CN112836679A (en) | 2021-05-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110188615B (en) | Facial expression recognition method, device, medium and system | |
WO2019033525A1 (en) | Au feature recognition method, device and storage medium | |
CN115131880B (en) | Multi-scale attention fusion double-supervision human face living body detection method | |
CN111488855A (en) | Fatigue driving detection method, device, computer equipment and storage medium | |
CN112101096A (en) | Suicide emotion perception method based on multi-mode fusion of voice and micro-expression | |
CN105956570B (en) | Smiling face's recognition methods based on lip feature and deep learning | |
CN113869276B (en) | Lie recognition method and system based on micro-expression | |
CN114943997A (en) | Cerebral apoplexy patient expression classification algorithm and system based on attention and neural network | |
Angeloni et al. | Age estimation from facial parts using compact multi-stream convolutional neural networks | |
Roy et al. | Ear Biometric: A Deep Learning Approach | |
Ni et al. | Diverse local facial behaviors learning from enhanced expression flow for microexpression recognition | |
CN112836679B (en) | Fast expression recognition algorithm and system based on dual-model probability optimization | |
Kazmi et al. | Wavelets based facial expression recognition using a bank of neural networks | |
Wang et al. | Deep learning (DL)-enabled system for emotional big data | |
Hou | Deep learning-based human emotion detection framework using facial expressions | |
CN115205923A (en) | Micro-expression recognition method based on macro-expression state migration and mixed attention constraint | |
Khan et al. | Evaluating the Efficiency of CBAM-Resnet Using Malaysian Sign Language. | |
Fang et al. | FAF: A novel multimodal emotion recognition approach integrating face, body and text | |
Jain et al. | IoT-based micro-expression recognition for nervousness detection in COVID-Like condition | |
Nidhi et al. | From methods to datasets: a detailed study on facial emotion recognition | |
Kasodu et al. | CNN-based Drowsiness Detection with Alarm System to Prevent Microsleep | |
Agnihotri et al. | Vision based Interpreter for Sign Languages and Static Gesture Control using Convolutional Neural Network | |
Dharanaesh et al. | Video based facial emotion recognition system using deep learning | |
Aslam et al. | Emotion recognition techniques with rule based and machine learning approaches | |
Bhosale et al. | Stress Level and Emotion Detection via Video Analysis, and Chatbot Interventions for Emotional Distress |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |